Preface


This  is  a  reworking  of  my  earlier  book  “Cryptography:  An  Introduction”  which  has  been 
available  online  for  over  a  decade.  In  the  intervening  years  there  have  been  major  advances  and 
changes  in  the  subject  which  have  led  me  to  revisit  much  of  the  material  in  this  book.  In  the  main  the 
book  remains  the  same,  in  that  it  tries  to  present  a  non-rigorous  treatment  of  modern  cryptography, 
which  is  itself  a  highly  rigorous  area  of  computer  science/mathematics.  Thus  the  book  acts  as  a 
stepping  stone  between  more  “traditional”  courses  which  are  taught  to  undergraduates  around  the 
world,  and  the  more  advanced  rigorous  courses  taught  in  graduate  school. 

The  motivation  for  such  a  bridging  book  is  that,  in  my  view,  the  traditional  courses  (which  deal 
with  basic  RSA  encryption  and  signatures,  and  perhaps  AES)  are  not  a  suitable  starting  point. 
They  do  not  emphasize  the  importance  of  what  it  means  for  a  system  to  be  secure;  and  are  often 
introduced  into  a  curriculum  as  a  means  of  demonstrating  the  applicability  of  mathematical  theory 
as  opposed  to  developing  the  material  as  a  subject  in  its  own  right.  However,  most  undergraduates 
could  not  cope  with  a  full-on  rigorous  treatment  from  the  start.  After  all  one  first  needs  to  get  a 
grasp  of  basic  ideas  before  one  can  start  building  up  a  theoretical  edifice. 

The main difference between this version and the Third Edition of “Cryptography: An Introduction”
is in the ordering of material. Now security definitions are made central to the discussion
of  modern  cryptography,  and  all  discussions  of  attacks  and  weaknesses  are  related  back  to  these 
definitions.  We  have  found  this  to  be  a  good  way  of  presenting  the  material  over  the  last  few  years 
in  Bristol;  hence  the  reordering.  In  addition  many  topics  have  been  updated,  and  explanations 
improved.  I  have  also  made  a  number  of  the  diagrams  more  pleasing  to  the  eye. 

Cryptography  courses  are  now  taught  at  all  major  universities;  sometimes  these  are  taught 
in  the  context  of  a  Mathematics  degree,  sometimes  in  the  context  of  a  Computer  Science  degree, 
and  sometimes  in  the  context  of  an  Electrical  Engineering  degree.  Indeed,  a  single  course  often 
needs  to  meet  the  requirements  of  all  three  types  of  students,  plus  maybe  some  from  other  subjects 
who  are  taking  the  course  as  an  “open  unit”.  The  backgrounds  and  needs  of  these  students  are 
different;  some  will  require  a  quick  overview  of  the  algorithms  currently  in  use,  whilst  others  will 
want  an  introduction  to  current  research  directions.  Hence,  there  seems  to  be  a  need  for  a  textbook 
which  starts  from  a  low  level  and  builds  confidence  in  students  until  they  are  able  to  read  the  texts 
mentioned  at  the  end  of  this  Preface. 

The  background  I  assume  is  what  one  could  expect  of  a  third  or  fourth  year  undergraduate 
in  computer  science.  One  can  assume  that  such  students  have  already  met  the  basics  of  discrete 
mathematics  (modular  arithmetic)  and  a  little  probability.  In  addition,  they  will  have  at  some  point 
done  (but  probably  forgotten)  elementary  calculus.  Not  that  one  needs  calculus  for  cryptography, 
but  the  ability  to  happily  deal  with  equations  and  symbols  is  certainly  helpful.  Apart  from  that  I 
introduce  everything  needed  from  scratch.  For  those  students  who  wish  to  dig  into  the  mathematics 
a  little  more,  or  who  need  some  further  reading,  I  have  provided  an  appendix  which  covers  most  of 
the  basic  algebra  and  notation  needed  to  cope  with  modern  cryptosystems. 

It  is  quite  common  for  computer  science  courses  not  to  include  much  of  complexity  theory  or 
formal  methods.  Many  such  courses  are  based  more  on  software  engineering  and  applications  of 
computer  science  to  areas  such  as  graphics,  vision  or  artificial  intelligence.  The  main  goal  of  such 
courses  is  to  train  students  for  the  workplace  rather  than  to  delve  into  the  theoretical  aspects  of 
the subject. Hence, I have introduced what parts of theoretical computer science I need, as and
when  required. 

I  am  not  mathematically  rigorous  at  all  steps,  given  the  target  audience,  but  aim  to  give  a 
flavour  of  the  mathematics  involved.  For  example  I  often  only  give  proof  outlines,  or  may  not 
worry  about  the  success  probabilities  of  many  of  the  reductions.  I  try  to  give  enough  of  the  gory 
details  to  demonstrate  why  a  protocol  or  primitive  has  been  designed  in  a  certain  way.  Readers 
wishing  for  a  more  in-depth  study  of  the  various  points  covered  or  a  more  mathematically  rigorous 
coverage  should  consult  one  of  the  textbooks  or  papers  in  the  Further  Reading  sections  at  the  end 
of  each  chapter. 

On  the  other  hand  we  use  the  terminology  of  groups  and  finite  fields  from  the  outset.  This  is  for 
two  reasons.  Firstly,  it  equips  students  with  the  vocabulary  to  read  the  latest  research  papers,  and 
hence  enables  students  to  carry  on  their  studies  at  the  research  level.  Secondly,  students  who  do 
not  progress  to  study  cryptography  at  the  postgraduate  level  will  find  that  to  understand  practical 
issues in the “real world”, such as API descriptions and standards documents, a knowledge of this
terminology  is  crucial.  We  have  taken  this  approach  with  our  students  in  Bristol,  who  do  not  have 
any  prior  exposure  to  this  form  of  mathematics,  and  find  that  it  works  well  as  long  as  abstract 
terminology  is  introduced  alongside  real-world  concrete  examples  and  motivation. 

I  have  always  found  that  when  reading  protocols  and  systems  for  the  first  time  the  hardest  part 
is  to  work  out  what  is  public  information  and  which  information  one  is  trying  to  keep  private.  This 
is  particularly  true  when  one  meets  a  public  key  encryption  algorithm  for  the  first  time,  or  one  is 
deciphering  a  substitution  cipher.  Hence  I  have  continued  with  the  colour  coding  from  the  earlier 
book.  Generally  speaking  items  in  red  are  secret  and  should  never  be  divulged  to  anyone.  Items  in 
blue  are  public  information  and  are  known  to  everyone,  or  are  known  to  the  party  one  is  currently 
pretending  to  be. 

For  example,  suppose  one  is  trying  to  break  a  system  and  recover  some  secret  message  m; 
suppose  the  attacker  computes  some  quantity  b.  Here  the  red  refers  to  the  quantity  the  attacker 
does  not  know  and  blue  refers  to  the  quantity  the  attacker  does  know.  If  one  is  then  able  to  write 
down,  after  some  algebra, 

b = ··· = m,

then  it  is  clear  something  is  wrong  with  our  cryptosystem.  The  attacker  has  found  out  something  he 
should  not.  This  colour  coding  will  be  used  at  all  places  where  it  adds  something  to  the  discussion. 
In  other  situations,  where  the  context  is  clear  or  all  data  is  meant  to  be  secret,  I  do  not  bother 
with  the  colours. 

To  aid  self-study  each  chapter  is  structured  as  follows: 

•  A  list  of  items  the  chapter  will  cover,  so  you  know  what  you  will  be  told  about. 

•  The  actual  chapter  contents. 

•  A  summary  of  what  the  chapter  contains.  This  will  be  in  the  form  of  revision  notes:  if 
you  wish  to  commit  anything  to  memory  it  should  be  these  facts. 

•  Further  Reading.  Each  chapter  contains  a  list  of  a  few  books  or  papers  from  which  further 
information  can  be  obtained.  Such  pointers  are  mainly  to  material  which  you  should  be 
able  to  tackle  given  that  you  have  read  the  prior  chapter. 

There  are  no  references  made  to  other  work  in  this  book;  it  is  a  textbook  and  I  did  not  want 
to  break  the  flow  with  references  to  this,  that  and  the  other.  Therefore,  you  should  not  assume 
that  ANY  of  the  results  in  this  book  are  my  own;  in  fact  NONE  are  my  own.  Those  who  wish  to 
obtain  pointers  to  the  literature  should  consult  one  of  the  books  mentioned  in  the  Further  Reading 
sections. 

The  book  is  clearly  too  large  for  a  single  course  on  cryptography;  this  gives  the  instructor  using 
the  book  a  large  range  of  possible  threads  through  the  topics.  For  a  traditional  cryptography  course 
within  a  Mathematics  department  I  would  recommend  Chapters  1,  2,  3,  7,  11,  12,  13,  14,  15,  16 
and 17. For a course in a Computer Science department I would recommend Chapters 1, 11, 12,
13,  14,  15  and  16,  followed  by  a  selection  from  18,  19,  20,  21  and  22.  In  any  course  I  strongly 
recommend  the  material  in  Chapter  11  should  be  covered.  This  is  to  enable  students  to  progress  to 
further  study,  or  to  be  able  to  deal  with  the  notions  which  occur  when  using  cryptography  in  the 
real  world.  The  other  chapters  in  this  book  provide  additional  supplementary  material  on  historical 
matters,  implementation  aspects,  or  act  as  introductions  to  topics  found  in  the  recent  literature. 

Special thanks go to the following people (whether academics, students or industrialists) for providing
input over the years on the various versions of the material: Nils Anderson, Endre Bangerter,
Guy Barwell, David Bernhard, Dan Bernstein, Ian Blake, Colin Boyd, Sergiu Bursuc, Jiun-Ming
Chen, Joan Daemen, Ivan Damgård, Gareth Davies, Reza Rezaeian Farashahi, Ed Geraghty, Florian
Hess, Nick Howgrave-Graham, Ellen Jochemsz, Thomas Johansson, Georgios Kafanas, Parimal
Kumar,  Jake  Longo  Galea,  Eugene  Luks,  Vadim  Lyubashevsky,  David  McCann,  Bruce  McIntosh, 
John  Malone-Lee,  Wenbo  Mao,  Dan  Martin,  John  Merriman,  Phong  Nguyen,  Emmanuela  Orsini, 
Dan  Page,  Christopher  Peikert,  Joop  van  de  Pol,  David  Rankin,  Vincent  Rijmen,  Ron  Rivest, 
Michal  Rybar,  Berry  Schoenmakers,  Tom  Shrimpton,  Martijn  Stam,  Ryan  Stanley,  Damien  Stehle, 
Edlyn  Teske,  Susan  Thomson,  Frederik  Vercauteren,  Bogdan  Warinschi,  Carolyn  Whitnall,  Steve 
Williams  and  Marcin  Wojcik. 

Nigel  Smart 
University  of  Bristol 


Further  Reading 

After  finishing  this  book  if  you  want  to  know  more  technical  details  then  I  would  suggest  the 
following  books: 

A.  J.  Menezes,  P.  van  Oorschot  and  S.A.  Vanstone.  The  Handbook  of  Applied  Cryptography.  CRC 
Press,  1997. 


J.  Katz  and  Y.  Lindell.  Introduction  to  Modern  Cryptography:  Principles  and  Protocols.  CRC 
Press,  2007. 
Part 1

Mathematical  Background 


Before  we  tackle  cryptography  we  need  to  cover  some  basic  facts  from  mathematics.  Much  of 
the  following  can  be  found  in  a  number  of  university  “Discrete  Mathematics”  courses  aimed  at 
Computer  Science  or  Engineering  students,  hence  one  hopes  not  all  of  this  section  is  new.  This 
part  is  mainly  a  quick  overview  to  allow  you  to  start  on  the  main  contents,  hence  you  may  want  to 
first  start  on  Part  2  and  return  to  Part  1  when  you  meet  some  concept  you  are  not  familiar  with. 
However,  I  would  suggest  reading  Section  2.2  of  Chapter  2  and  Section  3.1  of  Chapter  3  at  least, 
before  passing  on  to  the  rest  of  the  book.  For  those  who  want  more  formal  definitions  of  concepts, 
there  is  the  appendix  at  the  end  of  the  book. 


CHAPTER  1 


Modular  Arithmetic,  Groups,  Finite  Fields  and  Probability 


Chapter  Goals 


•  To  understand  modular  arithmetic. 

•  To  become  acquainted  with  groups  and  finite  fields. 

• To learn about basic techniques such as Euclid’s algorithm, the Chinese Remainder Theorem
and Legendre symbols.

•  To  recap  basic  ideas  from  probability  theory. 


1.1.  Modular  Arithmetic 

Much  of  this  book  will  be  spent  looking  at  the  applications  of  modular  arithmetic,  since  it  is 
fundamental  to  modern  cryptography  and  public  key  cryptosystems  in  particular.  Hence,  in  this 
chapter  we  introduce  the  basic  concepts  and  techniques  we  shall  require. 

The idea of modular arithmetic is essentially very simple and is identical to the “clock arithmetic”
you learn in school. For example, converting between the 24-hour and the 12-hour clock
systems is easy. One takes the value in the 24-hour clock system and reduces the hour modulo 12. For
example 13:00 in the 24-hour clock system is one o’clock in the 12-hour clock system, since 13
modulo 12 is equal to one.

More formally, we fix a positive integer N which we call the modulus. For two integers a and
b we write a = b (mod N) if N divides b − a, and we say that a and b are congruent modulo N.

Often we are lazy and just write a = b, if it is clear we are working modulo N.

We  can  also  consider  (mod  N)  as  a  postfix  operator  on  an  integer  which  returns  the  smallest 
non-negative value equal to the argument modulo N. For example

18 (mod 7) = 4,

−18 (mod 7) = 3.

The modulo operator is like the C operator %, except that in this book we usually take representatives
which are non-negative. For example in C or Java we have,

(−3) % 2 = −1

whilst we shall assume that (−3) (mod 2) = 1.
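
The difference between the two conventions can be sketched in a few lines of Python. The helper names `c_style_rem` and `mod` are illustrative only; the first imitates the truncating remainder of C and Java, while the second returns the non-negative representative used in this book (Python's own `%` already behaves this way for a positive modulus).

```python
def c_style_rem(a: int, n: int) -> int:
    """Remainder with the quotient truncated toward zero, as in C99 and Java."""
    q = abs(a) // abs(n)
    if (a < 0) != (n < 0):
        q = -q
    return a - q * n


def mod(a: int, n: int) -> int:
    """Smallest non-negative representative of a modulo a positive modulus n."""
    return a % n  # Python's % takes the sign of the modulus


print(c_style_rem(-3, 2))        # -1, the C/Java answer
print(mod(-3, 2))                # 1, the convention of this book
print(mod(18, 7), mod(-18, 7))   # 4 3
```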

For  convenience  we  define  the  set 

Z/NZ  =  {0,...,  N  -  1} 

as the set of remainders modulo N. This is the set of values produced by the postfix operator
(mod N). Note, some authors use the alternative notation of Z_N for the set Z/NZ, however, in this
book we shall stick to Z/NZ. For any set S we let #S denote the number of elements in the set
S, thus #(Z/NZ) = N.

The set Z/NZ has two basic operations on it, namely addition and multiplication. These are
defined  in  the  obvious  way,  for  example: 

(11 + 13) (mod 16) = 24 (mod 16) = 8

since 24 = 1 · 16 + 8 and

(11 · 13) (mod 16) = 143 (mod 16) = 15

since 143 = 8 · 16 + 15.

1.1.1. Groups: Addition and multiplication modulo N work almost the same as arithmetic over
the reals or the integers. In particular we have the following properties:

(1) Addition is closed:

∀a, b ∈ Z/NZ : a + b ∈ Z/NZ.

(2) Addition is associative:

∀a, b, c ∈ Z/NZ : (a + b) + c = a + (b + c).

(3) 0 is an additive identity:

∀a ∈ Z/NZ : a + 0 = 0 + a = a.

(4) The additive inverse always exists:

∀a ∈ Z/NZ : a + (N − a) = (N − a) + a = 0,

i.e. −a is an element which when combined with a produces the additive identity.

(5) Addition is commutative:

∀a, b ∈ Z/NZ : a + b = b + a.

(6) Multiplication is closed:

∀a, b ∈ Z/NZ : a · b ∈ Z/NZ.

(7) Multiplication is associative:

∀a, b, c ∈ Z/NZ : (a · b) · c = a · (b · c).

(8) 1 is a multiplicative identity:

∀a ∈ Z/NZ : a · 1 = 1 · a = a.

(9) Multiplication and addition satisfy the distributive law:

∀a, b, c ∈ Z/NZ : (a + b) · c = a · c + b · c.

(10) Multiplication is commutative:

∀a, b ∈ Z/NZ : a · b = b · a.

Many  of  the  sets  we  will  encounter  have  a  number  of  these  properties,  so  we  give  special  names  to 
these  sets  as  a  shorthand. 

Definition  1.1  (Groups).  A  group  is  a  set  with  an  operation  on  its  elements  which 

•  Is  closed, 

•  Has  an  identity, 

•  Is  associative,  and 

•  Every  element  has  an  inverse. 

A group which is commutative is often called abelian. Almost all groups that one meets in cryptography
are abelian, since the commutative property is often what makes them cryptographically
interesting.  Hence,  any  set  with  properties  1,  2,  3  and  4  above  is  called  a  group,  whilst  a  set  with 
properties  1,  2,  3,  4  and  5  is  called  an  abelian  group.  Standard  examples  of  groups  which  one  meets 
all  the  time  in  high  school  are: 
• The integers, the reals or the complex numbers under addition. Here the identity is 0 and
the inverse of x is −x, since x + (−x) = 0.

• The non-zero rational, real or complex numbers under multiplication. Here the identity is
1 and the inverse of x is denoted by $x^{-1}$, since $x \cdot x^{-1} = 1$.

A  group  is  called  multiplicative  if  we  tend  to  write  its  group  operation  in  the  same  way  as  one  does 
for  multiplication,  i.e. 

$f = g \cdot h$ and $g^5 = g \cdot g \cdot g \cdot g \cdot g$.

We use the notation (G, ·) in this case if there is some ambiguity as to which operation on G we
are considering. A group is called additive if we tend to write its group operation in the same way
as one does for addition, i.e.

f = g + h and 5 · g = g + g + g + g + g.

In  this  case  we  use  the  notation  (G,  +)  if  there  is  some  ambiguity.  An  abelian  group  is  called  cyclic 
if there is a special element, called the generator, from which every other element can be obtained
either  by  repeated  application  of  the  group  operation,  or  by  the  use  of  the  inverse  operation.  For 
example,  in  the  integers  under  addition  every  positive  integer  can  be  obtained  by  repeated  addition 
of  1  to  itself,  e.g.  7  can  be  expressed  by 

7=1  +  1  +  1  +  1  +  1  +  1  +  1. 

Every  negative  integer  can  be  obtained  from  a  positive  integer  by  application  of  the  additive  inverse 
operator,  which  sends  x  to  —x.  Hence,  we  have  that  1  is  a  generator  of  the  integers  under  addition. 

If g is a generator of the cyclic group G we often write G = ⟨g⟩. If G is multiplicative then
every element h of G can be written as

$h = g^x$,

whilst  if  G  is  additive  then  every  element  h  of  G  can  be  written  as 

h = x · g,

where  x  in  both  cases  is  some  integer  called  the  discrete  logarithm  of  h  to  the  base  g. 
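
As a small illustration of generators and discrete logarithms, the sketch below works in the multiplicative group of non-zero integers modulo a prime p (this group is only introduced formally later in the chapter, so its use here is an assumption made for the example). The function names are hypothetical, and the exhaustive search is only sensible for tiny groups.

```python
def is_generator(g: int, p: int) -> bool:
    """True if the powers of g run through every non-zero residue modulo the prime p."""
    seen = set()
    value = 1
    for _ in range(p - 1):
        value = (value * g) % p
        seen.add(value)
    return len(seen) == p - 1


def discrete_log(g: int, h: int, p: int):
    """Smallest x >= 0 with g**x = h (mod p), or None if h is not a power of g."""
    value = 1  # g**0
    for x in range(p - 1):
        if value == h % p:
            return x
        value = (value * g) % p
    return None


print(is_generator(2, 11))     # True
print(discrete_log(2, 9, 11))  # 6, since 2**6 = 64 = 9 (mod 11)
print(is_generator(3, 11))     # False: 3 only generates a subgroup of size 5
```

For groups of cryptographic size this exhaustive search is hopeless, which is exactly why the discrete logarithm is interesting.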

1.1.2.  Rings:  As  well  as  groups  we  also  use  the  concept  of  a  ring. 

Definition  1.2  (Rings).  A  ring  is  a  set  with  two  operations,  usually  denoted  by  +  and  •  for 
addition and multiplication, which satisfies properties 1 to 9 above. We can denote a ring and its
two operations by the triple (R, ·, +). If it also happens that multiplication is commutative we say
that  the  ring  is  commutative. 

This  may  seem  complicated  but  it  sums  up  the  type  of  sets  one  deals  with  all  the  time,  for  example 
the  infinite  commutative  rings  of  integers,  real  or  complex  numbers.  In  fact  in  cryptography  things 
are even easier since we only need to consider finite rings, like the commutative ring of integers
modulo N, Z/NZ. Thus Z/NZ is an abelian group when we only think of addition, but it is also
a  ring  if  we  want  to  worry  about  multiplication  as  well. 

1.1.3. Euler’s φ Function: In modular arithmetic it will be important to know when, given a
and b, the equation

a · x = b (mod N)

has  a  solution.  For  example  there  is  exactly  one  solution  in  the  set  Z/143Z  =  {0, . . . ,  142}  to  the 
equation 

7 · x = 3 (mod 143),

but there are no solutions to the equation

11 · x = 3 (mod 143),

however there are 11 solutions to the equation

11 · x = 22 (mod 143).

Luckily,  it  is  very  easy  to  test  when  such  an  equation  has  one,  many  or  no  solutions.  We  simply 
compute the greatest common divisor, or gcd, of a and N, i.e. gcd(a, N).

• If gcd(a, N) = 1 then there is exactly one solution. We find the value c such that a · c = 1
(mod N) and then we compute x ← b · c (mod N).

• If g = gcd(a, N) ≠ 1 and gcd(a, N) divides b then there are g solutions. Here we divide
the whole equation by g to produce the equation

a' · x' = b' (mod N'),

where a' = a/g, b' = b/g and N' = N/g. If x' is a solution to the above equation then

x ← x' + i · N'

for 0 ≤ i < g is a solution to the original one.

•  Otherwise  there  are  no  solutions. 

The  case  where  gcd(a,  N)  =  1  is  so  important  we  have  a  special  name  for  it:  we  say  a  and  N  are 
relatively  prime  or  coprime. 
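
The three cases above can be sketched as a short Python function. The name `solve_linear_congruence` is illustrative, and the inverse is obtained with Python's built-in `pow(a, -1, n)` (available from Python 3.8), which is an implementation convenience rather than anything the text prescribes.

```python
from math import gcd


def solve_linear_congruence(a: int, b: int, n: int) -> list:
    """All x in {0, ..., n-1} with a*x = b (mod n), in increasing order."""
    g = gcd(a, n)
    if b % g != 0:
        return []  # gcd(a, n) does not divide b: no solutions
    a_red, b_red, n_red = a // g, b // g, n // g
    # a_red and n_red are coprime, so a_red is invertible modulo n_red.
    c = pow(a_red, -1, n_red) if n_red > 1 else 0
    x0 = (b_red * c) % n_red
    return [x0 + i * n_red for i in range(g)]  # the g solutions


print(solve_linear_congruence(7, 3, 143))         # [123]
print(solve_linear_congruence(11, 3, 143))        # []
print(len(solve_linear_congruence(11, 22, 143)))  # 11
```

When gcd(a, N) = 1 the function reduces to the first case: g = 1 and the single solution is b · c (mod N).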

In the above description we wrote x ← y to mean that we assign x the value y; this is to
distinguish it from saying x = y, by which we mean x and y are equal. Clearly after assignment of
y to x the values of x and y are indeed equal. But imagine we wanted to increment x by one; we
would write x ← x + 1, the meaning of which is clear. Whereas x = x + 1 is possibly a statement
which evaluates to false!

Another  reason  for  this  special  notation  for  assignment  is  that  we  can  extend  it  to  algorithms, 
or procedures. So for example x ← A(z) might mean we assign x the output of procedure A on
input of z. This procedure might be randomized, and in such a case we are thereby assuming an
implicit probability distribution of the output x. We might even write x ← S where S is some
set, by which we mean we assign x a value from the set S chosen uniformly at random. Thus our
original x ← y notation is just a shorthand for x ← {y}.

The number of integers in Z/NZ which are relatively prime to N is given by the Euler φ
function, φ(N). Given the prime factorization of N it is easy to compute the value of φ(N). If N
has the prime factorization

$N = \prod_{i=1}^{n} p_i^{e_i}$

then

$\phi(N) = \prod_{i=1}^{n} p_i^{e_i - 1} (p_i - 1)$.

Note, the last statement is very important for cryptography: Given the factorization of N it is easy
to compute the value of φ(N). The most important cases for the value of φ(N) in cryptography
are:

(1) If p is prime then

φ(p) = p − 1.

(2) If p and q are both prime and p ≠ q then

φ(p · q) = (p − 1)(q − 1).
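
The formula can be checked against the definition with a small Python sketch. The function names are illustrative, and the trial-division factorization is an assumption made for brevity: it is only usable for small N, whereas for cryptographic sizes factoring is precisely the hard part.

```python
from math import gcd


def factorize(n: int) -> dict:
    """Return {prime: exponent} by trial division (small n only)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def euler_phi(n: int) -> int:
    """phi(n) from the prime factorization: product of p**(e-1) * (p-1)."""
    result = 1
    for p, e in factorize(n).items():
        result *= p ** (e - 1) * (p - 1)
    return result


def euler_phi_by_counting(n: int) -> int:
    """phi(n) straight from the definition: count x in Z/nZ coprime to n."""
    return sum(1 for x in range(n) if gcd(x, n) == 1)


print(euler_phi(7))    # 6   = 7 - 1
print(euler_phi(143))  # 120 = (11 - 1) * (13 - 1)
```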
1.1.4. Multiplicative Inverse Modulo N: We have just seen that when we wish to solve equations
of the form

a · x = b (mod N)

we reduce the problem to the question of examining whether a has a multiplicative inverse modulo
N, i.e. whether there is a number c such that

a · c = c · a = 1 (mod N).

Such a value of c is often written $a^{-1}$. Clearly $a^{-1}$ is the solution to the equation

a · x = 1 (mod N).

Hence, the inverse of a only exists when a and N are coprime, i.e. gcd(a, N) = 1. Of particular
interest is when N is a prime p, since then for all non-zero values of a ∈ Z/pZ we always obtain a
unique solution to

a · x = 1 (mod p).

Hence, if p is a prime then every non-zero element in Z/pZ has a multiplicative inverse. A ring like
Z/pZ with this property is called a field.
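
One standard way to find such an inverse is the extended Euclidean algorithm, which this passage does not describe; the sketch below assumes it and uses the hypothetical names `extended_gcd` and `mod_inverse`. It returns `None` when gcd(a, N) ≠ 1, matching the statement that no inverse exists in that case.

```python
def extended_gcd(a: int, b: int):
    """Return (g, s, t) with g = gcd(a, b) and g = s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def mod_inverse(a: int, n: int):
    """Inverse of a modulo n in {0, ..., n-1}, or None if gcd(a, n) != 1."""
    g, s, _ = extended_gcd(a % n, n)
    if g != 1:
        return None
    return s % n


print(mod_inverse(7, 143))   # 41, since 7 * 41 = 287 = 2 * 143 + 1
print(mod_inverse(11, 143))  # None, since gcd(11, 143) = 11
```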

Definition 1.3 (Fields). A field is a set with two operations (G, ·, +) such that

• (G, +) is an abelian group with identity denoted by 0,

• (G \ {0}, ·) is an abelian group,

• (G, ·, +) satisfies the distributive law.

Hence, a field is a commutative ring for which every non-zero element has a multiplicative inverse.
You have met fields before, for example consider the infinite fields of rational, real or complex
numbers.

1.1.5. The Set (Z/NZ)*: We define the set of all invertible elements in Z/NZ by

(Z/NZ)* = {x ∈ Z/NZ : gcd(x, N) = 1}.

The * in A*, for any ring A, refers to the largest subset of A which forms a group under multiplication.
Hence, the set (Z/NZ)* is a group with respect to multiplication and it has size φ(N). In
the special case when N is a prime p we have

(Z/pZ)* = {1, ..., p − 1}

since every non-zero element of Z/pZ is coprime to p. For an arbitrary field F the set F* is equal
to the set F \ {0}. To ease notation, for this very important case, we define

$\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z} = \{0, \ldots, p - 1\}$

and

$\mathbb{F}_p^* = (\mathbb{Z}/p\mathbb{Z})^* = \{1, \ldots, p - 1\}$.

The set $\mathbb{F}_p$ is said to be a finite field of characteristic p. In the next section we shall discuss a
more general type of finite field, but for now recall the important point that the integers modulo
N are only a field when N is a prime. We end this section with the most important theorem in
elementary group theory.

Theorem 1.4 (Lagrange’s Theorem). If (G, ·) is a group of order (size) n = #G then for all a ∈ G
we have $a^n = 1$.

So if x ∈ (Z/NZ)* then

$x^{\phi(N)} = 1 \pmod{N}$

since #(Z/NZ)* = φ(N). This leads us to Fermat’s Little Theorem, not to be confused with
Fermat’s Last Theorem which is something entirely different.

Theorem 1.5 (Fermat’s Little Theorem). Suppose p is a prime and a ∈ Z; then

$a^p = a \pmod{p}$.

Fermat’s  Little  Theorem  is  a  special  case  of  Lagrange’s  Theorem  and  will  form  the  basis  of  one  of 
the  primality  tests  considered  in  a  later  chapter. 
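
Both statements are easy to test numerically for small moduli. The following sketch, with illustrative function names, lists (Z/NZ)*, checks that raising any of its elements to the power #(Z/NZ)* gives 1, and checks Fermat's Little Theorem over a range of integers a, including negative ones.

```python
from math import gcd


def units(n: int) -> list:
    """The elements of (Z/nZ)*: residues coprime to n."""
    return [x for x in range(n) if gcd(x, n) == 1]


def check_lagrange(n: int) -> bool:
    """x**order = 1 (mod n) for every x in (Z/nZ)*, where order = #(Z/nZ)*."""
    group = units(n)
    order = len(group)  # this is phi(n)
    return all(pow(x, order, n) == 1 % n for x in group)


def check_fermat(p: int) -> bool:
    """a**p = a (mod p) for a range of integers a."""
    return all(pow(a, p, p) == a % p for a in range(-p, 2 * p))


print(len(units(143)))      # 120
print(check_lagrange(143))  # True
print(check_fermat(13))     # True
print(check_fermat(15))     # False: 15 is not prime, and 2**15 = 8 (mod 15)
```

The last line hints at why the theorem is useful for primality testing: a failure of the congruence proves that the modulus is composite.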


1.2.  Finite  Fields 

The integers modulo a prime p are not the only type of finite field. In this section we shall introduce
another type of finite field which is particularly important. At first reading you may wish to skip
this section. We shall only be using these general forms of finite fields when discussing the AES
block cipher, stream ciphers based on linear feedback shift registers and when we look at systems
based on elliptic curves.

For this section we let p denote a prime number. Consider the set of polynomials in X whose
coefficients are elements of $\mathbb{F}_p$. We denote this set $\mathbb{F}_p[X]$, which forms a ring with the natural
definition of addition and multiplication of polynomials modulo p. Of particular interest is the case
when p = 2, from which we draw most of our examples in this section. For example, in $\mathbb{F}_2[X]$ we
have


$(1 + X + X^2) + (X + X^3) = 1 + X^2 + X^3$,

$(1 + X + X^2) \cdot (X + X^3) = X + X^2 + X^4 + X^5$.


Just as with the integers modulo a number N, where the integers modulo N formed a ring, we can
take a polynomial f(X) and then the polynomials modulo f(X) also form a ring. We denote this
ring by

$\mathbb{F}_p[X] / f(X)\mathbb{F}_p[X]$

or more simply

$\mathbb{F}_p[X] / (f(X))$.

But to ease notation we will often write $\mathbb{F}_p[X]/f(X)$ for this latter ring. When $f(X) = X^4 + 1$ and
p = 2 we have, for example,

$(1 + X + X^2) \cdot (X + X^3) \pmod{X^4 + 1} = 1 + X^2$

since

$X + X^2 + X^4 + X^5 = (X + 1) \cdot (X^4 + 1) + (1 + X^2)$.

When  checking  the  above  equation  you  should  remember  we  are  working  modulo  two. 
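
These computations are convenient to reproduce on a computer. The sketch below assumes one common encoding, not taken from the text: a polynomial over the field with two elements is stored as a Python integer whose bit i is the coefficient of X^i, so addition modulo two becomes XOR. The function names are illustrative.

```python
def poly_add(a: int, b: int) -> int:
    """Add two binary polynomials: coefficients are added modulo two."""
    return a ^ b


def poly_mul(a: int, b: int) -> int:
    """Multiply two binary polynomials (carry-less multiplication)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def poly_mod(a: int, f: int) -> int:
    """Remainder of a on division by the non-zero polynomial f."""
    deg_f = f.bit_length() - 1
    while a.bit_length() - 1 >= deg_f:
        a ^= f << (a.bit_length() - 1 - deg_f)
    return a


a = 0b111     # 1 + X + X^2
b = 0b1010    # X + X^3
f = 0b10001   # X^4 + 1

print(bin(poly_add(a, b)))               # 0b1101   = 1 + X^2 + X^3
print(bin(poly_mul(a, b)))               # 0b110110 = X + X^2 + X^4 + X^5
print(bin(poly_mod(poly_mul(a, b), f)))  # 0b101    = 1 + X^2
```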


1.2.1. Inversion in General Finite Fields: Recall, when we looked at the integers modulo N
we looked at the equation a · x = b (mod N). We can consider a similar question for polynomials.
Given a, b and f, all of which are polynomials in $\mathbb{F}_p[X]$, does there exist a solution α to the equation
a · α = b (mod f)? With integers the answer depended on the greatest common divisor of a and
N, and we counted three possible cases. A similar three cases can occur for polynomials, with the
most important one being when a and f are coprime and so have greatest common divisor equal
to one.

A polynomial is called irreducible if it has no factors other than itself and the constant
polynomials. Hence, irreducibility of polynomials is the analogue of primality of numbers. Just as with
the integers modulo N, when N was prime we obtained a finite field, so when f(X) is irreducible
the ring $\mathbb{F}_p[X]/f(X)$ also forms a finite field.
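
For binary polynomials of small degree, irreducibility can be tested by trial division, in the same spirit as testing a small integer for primality. This is only a sketch under assumptions: polynomials are encoded as integers with bit i holding the coefficient of X^i, the names are hypothetical, and the exhaustive search is not how one would test polynomials of large degree.

```python
def poly_mod(a: int, f: int) -> int:
    """Remainder of the binary polynomial a on division by f (f non-zero)."""
    deg_f = f.bit_length() - 1
    while a.bit_length() - 1 >= deg_f:
        a ^= f << (a.bit_length() - 1 - deg_f)
    return a


def is_irreducible(f: int) -> bool:
    """Trial division by every polynomial of degree 1 up to deg(f) // 2."""
    deg = f.bit_length() - 1
    if deg < 1:
        return False  # constants are not counted as irreducible
    # A reducible polynomial of degree d has a factor of degree at most d // 2.
    for g in range(2, 1 << (deg // 2 + 1)):
        if poly_mod(f, g) == 0:
            return False
    return True


print(is_irreducible(0b10001))     # False: X^4 + 1 = (X + 1)^4 over two elements
print(is_irreducible(0b10011))     # True:  X^4 + X + 1
print(is_irreducible(0b10000011))  # True:  X^7 + X + 1
```

This also explains why the earlier example modulo X^4 + 1 produced a ring but not a field.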
1.2.2. Isomorphisms of Finite Fields: Consider the case p = 2 and the two different irreducible
polynomials

$f_1 = X^7 + X + 1$

and

$f_2 = Y^7 + Y^3 + 1$.

Now, consider the two finite fields

$F_1 = \mathbb{F}_2[X]/f_1(X)$ and $F_2 = \mathbb{F}_2[Y]/f_2(Y)$.

These both consist of the $2^7$ binary polynomials of degree less than seven. Addition in these two
fields is identical in that one just adds the coefficients of the polynomials modulo two. The only
difference is in how multiplication is performed:

$(X^3 + 1) \cdot (X^4 + 1) \pmod{f_1(X)} = X^4 + X^3 + X$,

$(Y^3 + 1) \cdot (Y^4 + 1) \pmod{f_2(Y)} = Y^4$.

A  natural  question  arises  as  to  whether  these  fields  are  “really”  different,  or  whether  they  just 
“look”  different.  In  mathematical  terms  the  question  is  whether  the  two  fields  are  isomorphic.  It 
turns  out  that  they  are  isomorphic  if  there  is  a  map 

$\phi : F_1 \to F_2$,

called a field isomorphism, which satisfies

$\phi(\alpha + \beta) = \phi(\alpha) + \phi(\beta)$,

$\phi(\alpha \cdot \beta) = \phi(\alpha) \cdot \phi(\beta)$.

Such an isomorphism exists for every two finite fields of the same order, although we will not show
it here. To describe the map above you only need to show how to express a root of $f_2(Y)$ in terms
of a polynomial in the root of $f_1(X)$, with the inverse map being a polynomial which expresses a
root of $f_1(X)$ in terms of a polynomial in the root of $f_2(Y)$, i.e.

$Y = g_1(X) = X + X^2 + X^3 + X^5$,

$X = g_2(Y) = Y^5 + Y^4$.

Notice that $g_2(g_1(X)) \pmod{f_1(X)} = X$, that $f_2(g_1(X)) \pmod{f_1(X)} = 0$ and that
$f_1(g_2(Y)) \pmod{f_2(Y)} = 0$.
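
The two multiplications and the three identities just stated can be checked mechanically. The sketch below assumes the integer encoding in which bit i is the coefficient of the i-th power of the variable; `gf_mul` and `evaluate` are illustrative names. Multiplication reduces after every shift instead of reducing once at the end, a design choice that keeps every intermediate value below degree seven.

```python
F1 = 0b10000011  # f_1 = X^7 + X + 1
F2 = 0b10001001  # f_2 = Y^7 + Y^3 + 1
G1 = 0b101110    # g_1(X) = X + X^2 + X^3 + X^5
G2 = 0b110000    # g_2(Y) = Y^5 + Y^4


def gf_mul(a: int, b: int, f: int) -> int:
    """Multiply reduced elements a and b in the field defined by f."""
    deg = f.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> deg) & 1:  # degree reached deg(f): subtract f
            a ^= f
    return result


def evaluate(poly: int, x: int, f: int) -> int:
    """Evaluate a binary polynomial at the field element x (Horner's rule)."""
    result = 0
    for i in reversed(range(poly.bit_length())):
        result = gf_mul(result, x, f) ^ ((poly >> i) & 1)
    return result


print(bin(gf_mul(0b1001, 0b10001, F1)))  # 0b11010 = X^4 + X^3 + X
print(bin(gf_mul(0b1001, 0b10001, F2)))  # 0b10000 = Y^4
print(evaluate(G2, G1, F1) == 0b10)      # True: g_2(g_1(X)) = X
print(evaluate(F2, G1, F1))              # 0:    f_2(g_1(X)) = 0
print(evaluate(F1, G2, F2))              # 0:    f_1(g_2(Y)) = 0
```

With these helpers the isomorphism from the first field to the second is the map sending an element a to `evaluate(a, G2, F2)`, since X corresponds to g_2(Y).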

One can show that all finite fields of the same characteristic and degree are isomorphic, thus we
have the following.

Theorem  1.6.  There  is  (up  to  isomorphism)  just  one  finite  field  of  each  prime  power  order. 

The  notation  we  use  for  these  fields  is  either  ¥q  or  GF(q ),  with  q  —  pd  where  d  is  the  degree 
of  the  irreducible  polynomial  used  to  construct  the  field;  we  of  course  have  ¥p  =  ¥p[X]/X.  The 
notation  GF(q )  means  the  Galois  field  of  q  elements,  in  honour  of  the  nineteenth  century  French 
mathematician  Galois.  Galois  had  an  interesting  life;  he  accomplished  his  scientific  work  at  an 
early  age  before  dying  in  a  duel. 

1.2.3. Field Towers and the Frobenius Map: There are a number of technical definitions
associated with finite fields which we need to cover. A subset $F$ of a field $K$ is called a subfield if $F$
is a field with respect to the same operations for which $K$ is a field. Each finite field $K$ contains a
copy of the integers modulo $p$ for some prime $p$, i.e. $\mathbb{F}_p \subset K$. We call this prime the characteristic
of the field, and often write this as char $K$. The subfield of integers modulo $p$ of a finite field is
called the prime subfield.

There is a map $\Phi$ called the $p$th power Frobenius map defined for any finite field by

$$\Phi : \mathbb{F}_q \longrightarrow \mathbb{F}_q, \qquad \alpha \longmapsto \alpha^p,$$

where $p$ is the characteristic of $\mathbb{F}_q$. The Frobenius map is an isomorphism of $\mathbb{F}_q$ with itself; such an
isomorphism is called an automorphism. An interesting property is that the set of elements fixed
by the Frobenius map is the prime field, i.e.

$$\{\alpha \in \mathbb{F}_q : \alpha^p = \alpha\} = \mathbb{F}_p.$$

Notice that this is a kind of generalization of Fermat’s Little Theorem to finite fields. For any
automorphism $\chi$ of a finite field, the set of elements fixed by $\chi$ is a field, called the fixed field of $\chi$.
Hence the previous statement says that the fixed field of the Frobenius map is the prime field $\mathbb{F}_p$.

Not only does $\mathbb{F}_q$ contain a copy of $\mathbb{F}_p$ but $\mathbb{F}_{p^d}$ contains a copy of $\mathbb{F}_{p^e}$ for every value of $e$ dividing
$d$; see Figure 1.1 for an example. In addition $\mathbb{F}_{p^e}$ is the fixed field of the automorphism $\Phi^e$, i.e.

$$\{\alpha \in \mathbb{F}_{p^d} : \alpha^{p^e} = \alpha\} = \mathbb{F}_{p^e}.$$
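As a small illustration of fixed fields, the sketch below enumerates the elements of the field of 16 elements fixed by the Frobenius map and by its square. It assumes the representation $\mathbb{F}_2[X]/(X^4 + X + 1)$ with bit $i$ of an integer holding the coefficient of $X^i$; both the choice of polynomial and the function names are ours.

```python
def gf2_mulmod(a: int, b: int, f: int) -> int:
    """Multiply two reduced binary polynomials modulo f (bit i = coeff of X^i)."""
    deg_f = f.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> deg_f) & 1:
            a ^= f
    return result


def frobenius_power(x: int, k: int, f: int) -> int:
    """Apply the characteristic-2 Frobenius map x -> x^2 a total of k times."""
    for _ in range(k):
        x = gf2_mulmod(x, x, f)
    return x


F16 = 0b10011  # X^4 + X + 1, irreducible over F_2

fixed_by_phi = [x for x in range(16) if frobenius_power(x, 1, F16) == x]
fixed_by_phi2 = [x for x in range(16) if frobenius_power(x, 2, F16) == x]
print(fixed_by_phi)    # elements with x^2 = x
print(fixed_by_phi2)   # elements with x^4 = x
```

The first list has two elements (the prime field $\mathbb{F}_2$) and the second has four (a copy of $\mathbb{F}_4$ inside $\mathbb{F}_{16}$), matching the statement with $d = 4$ and $e = 1, 2$.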

Figure 1.1. Example tower of finite fields (the figure, which includes the field $\mathbb{F}_{p^{12}}$, is not reproduced here). The number on each line gives the
degree of the subfield within the larger field.

If we define $\mathbb{F}_q$ as $\mathbb{F}_p[X]/f(X)$, for some irreducible polynomial $f(X)$ with $p^{\deg f} = q$, then another
way of thinking of $\mathbb{F}_q$ is as the set of polynomials of degree less than $\deg f$ in a root of $f(X)$. In
other words let $\alpha$ be a “formal” root of $f(X)$, then we define

$$\mathbb{F}_q = \left\{ \sum_{i=0}^{\deg f - 1} a_i \cdot \alpha^i : a_i \in \mathbb{F}_p \right\}$$

with addition being addition of polynomials modulo $p$, and multiplication being polynomial
multiplication modulo $p$, subject to the fact that $f(\alpha) = 0$. To see why this amounts to the same object
take two polynomials $a(X)$ and $b(X)$ and let $c(X) = a(X) \cdot b(X) \pmod{f(X)}$. Then there is a
polynomial $q(X)$ such that

$$c(X) = a(X) \cdot b(X) + q(X) \cdot f(X),$$

which is our multiplication method given in terms of polynomials. In terms of a root $\alpha$ of $f(X)$ we
note that we have

$$\begin{aligned}
c(\alpha) &= a(\alpha) \cdot b(\alpha) + q(\alpha) \cdot f(\alpha) \\
&= a(\alpha) \cdot b(\alpha) + q(\alpha) \cdot 0 \\
&= a(\alpha) \cdot b(\alpha).
\end{aligned}$$

Another interesting property is that if $p$ is the characteristic of $\mathbb{F}_q$ then if we take any element
$\alpha \in \mathbb{F}_q$ and add it to itself $p$ times we obtain zero, e.g. in $\mathbb{F}_{49}$ we have

$$X + X + X + X + X + X + X = 7 \cdot X = 0 \pmod{7}.$$

The non-zero elements of a finite field, usually denoted $\mathbb{F}_q^*$, form a cyclic finite abelian group, called
the multiplicative group of the finite field. We call a generator of $\mathbb{F}_q^*$ a primitive element in the
finite field. Such primitive elements always exist, and indeed there are $\phi(q - 1)$ of them, where $\phi$ is
Euler's function, and so the multiplicative group is always cyclic. In other words there always exists an element $g \in \mathbb{F}_q$ such
that every non-zero element $\alpha$ can be written as

$$\alpha = g^x$$

for some integer value of $x$.

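Example: As an example consider the field of eight elements defined by

$$\mathbb{F}_{2^3} = \mathbb{F}_2[X]/(X^3 + X + 1).$$

In this field there are seven non-zero elements; namely

$$1, \ \alpha, \ \alpha + 1, \ \alpha^2, \ \alpha^2 + 1, \ \alpha^2 + \alpha, \ \alpha^2 + \alpha + 1,$$

where $\alpha$ is a root of $X^3 + X + 1$. We see that $\alpha$ is a primitive element in $\mathbb{F}_{2^3}$ since

$$\begin{aligned}
\alpha^1 &= \alpha, \\
\alpha^2 &= \alpha^2, \\
\alpha^3 &= \alpha + 1, \\
\alpha^4 &= \alpha^2 + \alpha, \\
\alpha^5 &= \alpha^2 + \alpha + 1, \\
\alpha^6 &= \alpha^2 + 1, \\
\alpha^7 &= 1.
\end{aligned}$$

Notice that for a prime $p$ this means that the integers modulo $p$ also have a primitive element, since
$\mathbb{Z}/p\mathbb{Z} = \mathbb{F}_p$ is a finite field.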
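The table of powers in this example can be regenerated with a few lines of Python. This is only a sketch: it assumes elements of $\mathbb{F}_{2^3}$ are stored as 3-bit integers (bit $i$ is the coefficient of $\alpha^i$), and the helper `times_alpha` is a name introduced here.

```python
F8 = 0b1011  # X^3 + X + 1


def times_alpha(v: int) -> int:
    """Multiply an element of F_2[X]/(X^3 + X + 1) by alpha."""
    v <<= 1
    if v & 0b1000:   # an alpha^3 term appeared: replace it using alpha^3 = alpha + 1
        v ^= F8
    return v


powers = []
v = 1
for _ in range(7):
    v = times_alpha(v)
    powers.append(v)  # powers[k] encodes alpha^(k + 1)

# alpha is primitive exactly when its powers hit every non-zero element once.
is_primitive = sorted(powers) == list(range(1, 8))
print(powers, is_primitive)
```

Read as bit patterns, the list of powers is $\alpha, \alpha^2, \alpha + 1, \alpha^2 + \alpha, \alpha^2 + \alpha + 1, \alpha^2 + 1, 1$, in agreement with the table.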


1.3.  Basic  Algorithms 

There are several basic numerical algorithms or techniques which everyone should know since they
occur in many places in this book. The ones we shall concentrate on here are

• Euclid’s gcd algorithm,

• The Chinese Remainder Theorem,

• Computing Jacobi and Legendre symbols.

1.3.1. Greatest Common Divisors: In the previous sections we said that when trying to solve

$$a \cdot x = b \pmod{N}$$

in integers, or

$$a \cdot \alpha = b \pmod{f}$$

for polynomials modulo a prime, we needed to compute the greatest common divisor. This was
particularly important in determining whether $a \in \mathbb{Z}/N\mathbb{Z}$ or $a \in \mathbb{F}_p[X]/f$ had a multiplicative
inverse or not, i.e. $\gcd(a, N) = 1$ or $\gcd(a, f) = 1$. We did not explain how this greatest common
divisor is computed, neither did we explain how the inverse is to be computed when we know it
exists. We shall now address this omission by explaining one of the oldest algorithms known to
man, namely the Euclidean algorithm.

If we were able to factor $a$ and $N$ into primes, or $a$ and $f$ into irreducible polynomials, then
computing the greatest common divisor would be particularly easy. For example if we were given

$$a = 230\,895\,588\,646\,864 = 2^4 \cdot 157 \cdot 4513^3,$$

$$b = 33\,107\,658\,350\,407\,876 = 2^2 \cdot 157 \cdot 2269^3 \cdot 4513,$$

then it is easy, from the factorization, to compute the gcd as

$$\gcd(a, b) = 2^2 \cdot 157 \cdot 4513 = 2\,834\,164.$$

However, factoring is an expensive operation for integers, so the above method is very slow for
large integers. Computing greatest common divisors, on the other hand, is actually easy as we shall now
show. Although factoring for polynomials modulo a prime is very easy, it turns out that almost
all algorithms to factor polynomials require access to an algorithm to compute greatest common
divisors. Hence, in both situations we need to be able to compute greatest common divisors without
recourse to factoring.
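To make the factorization-based method concrete, here is a short Python sketch. The dictionaries mapping primes to exponents are an encoding chosen for this illustration, and `gcd_from_factors` is a hypothetical helper name; `math.gcd` is used only as an independent check.

```python
from math import gcd, prod

# Prime factorizations of the two integers in the example: {prime: exponent}.
factors_a = {2: 4, 157: 1, 4513: 3}
factors_b = {2: 2, 157: 1, 2269: 3, 4513: 1}


def gcd_from_factors(fa: dict, fb: dict) -> int:
    """gcd as the product of shared primes raised to the smaller exponent."""
    return prod(p ** min(e, fb[p]) for p, e in fa.items() if p in fb)


a = prod(p ** e for p, e in factors_a.items())
b = prod(p ** e for p, e in factors_b.items())

print(a, b)
print(gcd_from_factors(factors_a, factors_b), gcd(a, b))
```

Once the factorizations are known the gcd is immediate; the expensive part, which this sketch sidesteps by starting from the factors, is finding them in the first place.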


1.3.2. The Euclidean Algorithm: In the following we will consider the case of integers only; the
generalization to polynomials is easy since both integers and polynomials allow Euclidean division.
For integers $a$ and $b$, Euclidean division is the operation of finding $q$ and $r$ with $0 \le r < |b|$ such
that

$$a = q \cdot b + r,$$

i.e. $r \leftarrow a \pmod{b}$. For polynomials $f$ and $g$, Euclidean division means finding polynomials $q, r$
with $0 \le \deg r < \deg g$ such that

$$f = q \cdot g + r.$$

To compute the gcd of $r_0 = a$ and $r_1 = b$ we compute $r_2, r_3, r_4, \ldots$ by $r_{i+2} = r_i \pmod{r_{i+1}}$, until
$r_{m+1} = 0$, so we have:

$$\begin{aligned}
r_2 &\leftarrow r_0 - q_1 \cdot r_1, \\
r_3 &\leftarrow r_1 - q_2 \cdot r_2, \\
&\;\;\vdots \\
r_m &\leftarrow r_{m-2} - q_{m-1} \cdot r_{m-1}, \\
r_{m+1} &\leftarrow 0, \text{ i.e. } r_m \text{ divides } r_{m-1}.
\end{aligned}$$

If $d$ divides $a$ and $b$ then $d$ divides $r_2$, $r_3$, $r_4$ and so on. Hence

$$\gcd(a, b) = \gcd(r_0, r_1) = \gcd(r_1, r_2) = \cdots = \gcd(r_{m-1}, r_m) = r_m.$$

As an example of this algorithm we can show that $3 = \gcd(21, 12)$. Using the Euclidean algorithm
we compute $\gcd(21, 12)$ in the steps

$$\begin{aligned}
\gcd(21, 12) &= \gcd(21 \pmod{12}, 12) \\
&= \gcd(9, 12) \\
&= \gcd(12 \pmod{9}, 9) \\
&= \gcd(3, 9) \\
&= \gcd(9 \pmod{3}, 3) \\
&= \gcd(0, 3) = 3.
\end{aligned}$$

Or, as an example with larger numbers,

$$\begin{aligned}
\gcd(1\,426\,668\,559\,730, \ 810\,653\,094\,756) &= \gcd(810\,653\,094\,756, \ 616\,015\,464\,974), \\
&= \gcd(616\,015\,464\,974, \ 194\,637\,629\,782), \\
&= \gcd(194\,637\,629\,782, \ 32\,102\,575\,628), \\
&= \gcd(32\,102\,575\,628, \ 2\,022\,176\,014), \\
&= \gcd(2\,022\,176\,014, \ 1\,769\,935\,418), \\
&= \gcd(1\,769\,935\,418, \ 252\,240\,596), \\
&= \gcd(252\,240\,596, \ 4\,251\,246), \\
&= \gcd(4\,251\,246, \ 1\,417\,082), \\
&= \gcd(1\,417\,082, \ 0), \\
&= 1\,417\,082.
\end{aligned}$$

The Euclidean algorithm essentially works because the mapping

$$(a, b) \longmapsto (a \pmod{b}, \ b),$$

for $a \ge b$ is a gcd-preserving mapping, i.e. the input and output of pairs of integers from the
mapping have the same greatest common divisor. In computer science terms the greatest common
divisor is an invariant of the mapping. In addition for inputs $a, b > 0$ the algorithm terminates
since the mapping produces a sequence of decreasing non-negative integers, which must eventually
end up with the smallest value being zero.
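The remainder sequence and the invariant can be sketched in a few lines of Python. The function names `remainder_sequence` and `euclid_gcd` are introduced here for illustration only.

```python
def remainder_sequence(a: int, b: int) -> list:
    """Return r_0 = a, r_1 = b, r_2, ... up to and including the first zero."""
    r = [a, b]
    while r[-1] != 0:
        r.append(r[-2] % r[-1])   # r_{i+2} = r_i mod r_{i+1}
    return r


def euclid_gcd(a: int, b: int) -> int:
    """Euclidean algorithm: the last non-zero remainder is the gcd."""
    while b != 0:
        a, b = b, a % b           # gcd-preserving step
    return a


print(remainder_sequence(21, 12))
print(euclid_gcd(1426668559730, 810653094756))
```

For the inputs 21 and 12 the sequence is 21, 12, 9, 3, 0, and the second call reproduces the larger worked example with result 1 417 082.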

The trouble with the above method for determining a greatest common divisor is that computers
find it much easier to add and multiply numbers than to take remainders or quotients.
Hence, implementing a gcd algorithm with the above gcd-preserving mapping will usually be very
inefficient. Fortunately, there are a number of other gcd-preserving mappings: for example the
following is a gcd-preserving mapping between pairs of integers which are not both even,

$$(a, b) \longmapsto \begin{cases}
((a - b)/2, \ b) & \text{if } a \text{ and } b \text{ are odd,} \\
(a/2, \ b) & \text{if } a \text{ is even and } b \text{ is odd,} \\
(a, \ b/2) & \text{if } a \text{ is odd and } b \text{ is even.}
\end{cases}$$

Recall that computers find it easy to divide by two, since in binary this is accomplished by a cheap
bit shift operation. This latter mapping gives rise to the binary Euclidean algorithm, which is the
one usually implemented on a computer. Essentially, this algorithm uses the above gcd-preserving
mapping after first removing any power of two in the gcd. Algorithm 1.1 explains how this works,
on input of two positive integers $a$ and $b$.

Algorithm 1.1: Binary Euclidean algorithm

g ← 1.
/* Remove powers of two from the gcd */
while (a mod 2 = 0) and (b mod 2 = 0) do
    a ← a/2, b ← b/2, g ← 2 · g.
/* At least one of a and b is now odd */
while a ≠ 0 do
    while a mod 2 = 0 do a ← a/2.
    while b mod 2 = 0 do b ← b/2.
    /* Now both a and b are odd */
    if a ≥ b then a ← (a − b)/2.
    else b ← (b − a)/2.
return g · b


1.3.3. The Extended Euclidean Algorithm: Using the Euclidean algorithm we can determine
when $a$ has an inverse modulo $N$ by testing whether

$$\gcd(a, N) = 1.$$

But we still do not know how to determine the inverse when it exists. To do this we use a variant
of Euclid’s gcd algorithm, called the extended Euclidean algorithm. Recall we had

$$r_{i-2} = q_{i-1} \cdot r_{i-1} + r_i$$

with $r_m = \gcd(r_0, r_1)$. Now we unwind the above and write each $r_i$, for $i \ge 2$, in terms of $a$ and $b$.
So we have the identities

$$\begin{aligned}
r_2 &= r_0 - q_1 \cdot r_1 = a - q_1 \cdot b \\
r_3 &= r_1 - q_2 \cdot r_2 = b - q_2 \cdot (a - q_1 \cdot b) = -q_2 \cdot a + (1 + q_1 \cdot q_2) \cdot b \\
&\;\;\vdots \\
r_{i-2} &= s_{i-2} \cdot a + t_{i-2} \cdot b \\
r_{i-1} &= s_{i-1} \cdot a + t_{i-1} \cdot b \\
r_i &= r_{i-2} - q_{i-1} \cdot r_{i-1} \\
&= a \cdot (s_{i-2} - q_{i-1} \cdot s_{i-1}) + b \cdot (t_{i-2} - q_{i-1} \cdot t_{i-1}) \\
&\;\;\vdots \\
r_m &= s_m \cdot a + t_m \cdot b.
\end{aligned}$$

The extended Euclidean algorithm takes as input $a$ and $b$ and outputs values $r_m$, $s_m$ and $t_m$ such
that

$$r_m = \gcd(a, b) = s_m \cdot a + t_m \cdot b.$$

Hence, we can now solve our original problem of determining the inverse of $a$ modulo $N$, when
such an inverse exists. We first apply the extended Euclidean algorithm to $a$ and $b = N$ so as to
compute $d, x, y$ such that

$$d = \gcd(a, N) = x \cdot a + y \cdot N.$$

This algorithm is described in Algorithm 1.2. The value $d$ will be equal to one, as we have assumed
that $a$ and $N$ are coprime. Given the output from this algorithm, we can solve the equation $a \cdot x = 1 \pmod{N}$,
since we have $d = x \cdot a + y \cdot N = x \cdot a \pmod{N}$.

Algorithm 1.2: Extended Euclidean algorithm

s ← 0, s' ← 1, t ← 1, t' ← 0, r ← b, r' ← a.
while r ≠ 0 do
    q ← ⌊r'/r⌋.
    (r', r) ← (r, r' − q · r).
    (s', s) ← (s, s' − q · s).
    (t', t) ← (t, t' − q · t).
d ← r', x ← s', y ← t'.
return d, x, y.


As an example suppose we wish to compute the inverse of 7 modulo 19. We first set $r_0 = 7$ and
$r_1 = 19$ and then we compute

$$\begin{aligned}
r_2 &\leftarrow 5 = 19 - 2 \cdot 7 \\
r_3 &\leftarrow 2 = 7 - 5 = 7 - (19 - 2 \cdot 7) = -19 + 3 \cdot 7 \\
r_4 &\leftarrow 1 = 5 - 2 \cdot 2 = (19 - 2 \cdot 7) - 2 \cdot (-19 + 3 \cdot 7) = 3 \cdot 19 - 8 \cdot 7.
\end{aligned}$$

Hence,

$$1 = -8 \cdot 7 \pmod{19}$$

and so

$$7^{-1} = -8 = 11 \pmod{19}.$$
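The computation above can be reproduced with a short Python sketch of the extended Euclidean algorithm. It returns $d, x, y$ with $d = x \cdot a + y \cdot b$, tracking the invariant $r = s \cdot a + t \cdot b$ throughout; the function names `xgcd` and `inverse_mod` are ours.

```python
def xgcd(a: int, b: int):
    """Return (d, x, y) with d = gcd(a, b) = x*a + y*b."""
    s, s_prev = 0, 1      # coefficients of a for r and r'
    t, t_prev = 1, 0      # coefficients of b for r and r'
    r, r_prev = b, a
    while r != 0:
        q = r_prev // r
        r_prev, r = r, r_prev - q * r
        s_prev, s = s, s_prev - q * s
        t_prev, t = t, t_prev - q * t
    return r_prev, s_prev, t_prev


def inverse_mod(a: int, n: int) -> int:
    """Inverse of a modulo n, assuming gcd(a, n) = 1."""
    d, x, _ = xgcd(a % n, n)
    if d != 1:
        raise ValueError("a is not invertible modulo n")
    return x % n


print(xgcd(7, 19), inverse_mod(7, 19))
```

For the inputs 7 and 19 the sketch yields $d = 1$, $x = -8$, $y = 3$, i.e. $1 = -8 \cdot 7 + 3 \cdot 19$, and hence the inverse 11.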

Note,  a  binary  version  of  the  above  algorithm  also  exists.  We  leave  it  to  the  reader  to  work  out  the 
details  of  the  binary  version  of  the  extended  Euclidean  algorithm. 


1.3.4. Chinese Remainder Theorem (CRT): The Chinese Remainder Theorem, or CRT, is
also a very old piece of mathematics, which dates back at least 2 000 years. We shall use the CRT
in a few places, for example to improve the performance of the decryption operation of RSA and
in a number of other protocols. In a nutshell the CRT states that if we have the two equations

$$x = a \pmod{N} \quad \text{and} \quad x = b \pmod{M}$$

then there is a unique solution modulo $M \cdot N$ if and only if $\gcd(N, M) = 1$. In addition it gives
a method to easily find the solution. For example if the two equations are given by

$$x = 4 \pmod{7},$$
$$x = 3 \pmod{5},$$

then we have

$$x = 18 \pmod{35}.$$

It is easy to check that this is a solution, since $18 \pmod{7} = 4$ and $18 \pmod{5} = 3$. But how did
we produce this solution?

We shall first show how this can be done naively from first principles and then we shall give
the general method. We have the equations

$$x = 4 \pmod{7} \quad \text{and} \quad x = 3 \pmod{5}.$$

Hence for some $u$ we have

$$x = 4 + 7 \cdot u \quad \text{and} \quad x = 3 \pmod{5}.$$

Putting these latter two equations together, one obtains

$$4 + 7 \cdot u = 3 \pmod{5}.$$

We then rearrange the equation to find

$$2 \cdot u = 7 \cdot u = 3 - 4 = 4 \pmod{5}.$$

Now since $\gcd(2, 5) = 1$ we can solve the above equation for $u$. First we compute $2^{-1} \pmod{5} = 3$,
since $2 \cdot 3 = 6 = 1 \pmod{5}$. Then we compute the value of $u = 2^{-1} \cdot 4 = 3 \cdot 4 = 2 \pmod{5}$. Then
substituting this value of $u$ back into our equation for $x$ gives the solution

$$x = 4 + 7 \cdot u = 4 + 7 \cdot 2 = 18.$$


The Chinese Remainder Theorem: Two Equations: The case of two equations is so important
we now give a general formula. We assume that $\gcd(N, M) = 1$, and that we are given the
equations

$$x = a \pmod{M} \quad \text{and} \quad x = b \pmod{N}.$$

We first compute

$$T \leftarrow M^{-1} \pmod{N}$$

which is possible since we have assumed $\gcd(N, M) = 1$. We then compute

$$u \leftarrow (b - a) \cdot T \pmod{N}.$$

The solution modulo $M \cdot N$ is then given by

$$x \leftarrow a + u \cdot M.$$

To see this always works we verify

$$\begin{aligned}
x \pmod{M} &= a + u \cdot M \pmod{M} \\
&= a, \\
x \pmod{N} &= a + u \cdot M \pmod{N} \\
&= a + (b - a) \cdot T \cdot M \pmod{N} \\
&= a + (b - a) \cdot M^{-1} \cdot M \pmod{N} \\
&= a + (b - a) \pmod{N} \\
&= b.
\end{aligned}$$
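The two-equation formula translates almost line for line into Python. The sketch below assumes Python 3.8 or later, where the built-in `pow(M, -1, N)` returns a modular inverse; the function name `crt_two` is ours.

```python
def crt_two(a: int, M: int, b: int, N: int) -> int:
    """Solve x = a (mod M), x = b (mod N) for coprime M and N.

    Returns the solution in the range 0 <= x < M*N when 0 <= a < M.
    """
    T = pow(M, -1, N)          # T = M^{-1} mod N; raises ValueError if gcd(M, N) != 1
    u = ((b - a) * T) % N
    return a + u * M


x = crt_two(4, 7, 3, 5)
print(x, x % 7, x % 5)
```

With $a = 4$, $M = 7$, $b = 3$, $N = 5$ this gives $T = 3$, $u = 2$ and $x = 18$, as in the calculation from first principles.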


The Chinese Remainder Theorem: The General Case: Now we turn to the general case of
the CRT where we consider more than two equations at once. Let $m_1, \ldots, m_r$ be pairwise relatively
prime and let $a_1, \ldots, a_r$ be given. We want to find $x$ modulo $M = m_1 \cdot m_2 \cdots m_r$ such that

$$x = a_i \pmod{m_i} \quad \text{for all } i.$$

The Chinese Remainder Theorem guarantees a unique solution given by

$$x \leftarrow \sum_{i=1}^{r} a_i \cdot M_i \cdot y_i \pmod{M}$$

where

$$M_i \leftarrow M/m_i \quad \text{and} \quad y_i \leftarrow M_i^{-1} \pmod{m_i}.$$

As an example suppose we wish to find the unique $x$ modulo

$$M = 1001 = 7 \cdot 11 \cdot 13$$

such that

$$x = 5 \pmod{7},$$
$$x = 3 \pmod{11},$$
$$x = 10 \pmod{13}.$$

We compute

$$M_1 \leftarrow 143, \quad y_1 \leftarrow 5,$$
$$M_2 \leftarrow 91, \quad y_2 \leftarrow 4,$$
$$M_3 \leftarrow 77, \quad y_3 \leftarrow 12.$$

Then, the solution is given by

$$\begin{aligned}
x &\leftarrow \sum_{i=1}^{r} a_i \cdot M_i \cdot y_i \pmod{M} \\
&= 715 \cdot 5 + 364 \cdot 3 + 924 \cdot 10 \pmod{1001} \\
&= 894.
\end{aligned}$$
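The general formula can be sketched in Python as follows, again assuming Python 3.8 or later for `pow(M_i, -1, m_i)` and `math.prod`. The function name `crt` and the list-based interface are choices made for this illustration.

```python
from math import prod


def crt(residues: list, moduli: list) -> int:
    """Solve x = a_i (mod m_i) for pairwise coprime moduli m_i."""
    M = prod(moduli)
    x = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        y_i = pow(M_i, -1, m_i)    # y_i = M_i^{-1} mod m_i
        x += a_i * M_i * y_i
    return x % M


solution = crt([5, 3, 10], [7, 11, 13])
print(solution, [solution % m for m in (7, 11, 13)])
```

For the worked example the intermediate values are $(M_i, y_i) = (143, 5), (91, 4), (77, 12)$ and the result is 894, whose residues modulo 7, 11 and 13 are 5, 3 and 10.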


1.3.5. The Legendre Symbol: Let $p$ denote a prime, greater than two. Consider the mapping

$$\mathbb{F}_p \longrightarrow \mathbb{F}_p, \qquad \alpha \longmapsto \alpha^2.$$

Since $-\alpha$ and $\alpha$ are distinct elements of $\mathbb{F}_p$ if $\alpha \ne 0$ and $p \ne 2$, and because $(-\alpha)^2 = \alpha^2$, we see
that the mapping $\alpha \longmapsto \alpha^2$ is exactly two-to-one on the non-zero elements of $\mathbb{F}_p$. So if an element
$x$ in $\mathbb{F}_p$ has a square root, then it has exactly two square roots (unless $x = 0$) and exactly half of
the elements of $\mathbb{F}_p^*$ are squares. The set of squares in $\mathbb{F}_p^*$ are called the quadratic residues and they
form a subgroup of order $(p - 1)/2$ of the multiplicative group $\mathbb{F}_p^*$. The elements of $\mathbb{F}_p^*$ which are
not squares are called the quadratic non-residues.

To make it easy to detect squares modulo a prime $p$ we define the Legendre symbol

$$\left(\frac{a}{p}\right).$$

This is defined to be equal to 0 if $p$ divides $a$, equal to $+1$ if $a$ is a quadratic residue and equal to
$-1$ if $a$ is a quadratic non-residue.

Notice that, if $a \ne 0$ is a square then it has order dividing $(p - 1)/2$ since there is an $s$ such
that $s^2 = a$ and $s$ has order dividing $(p - 1)$ (by Lagrange’s Theorem). Hence if $a$ is a square it
must have order dividing $(p - 1)/2$, and so $a^{(p-1)/2} \pmod{p} = 1$. However, if $a$ is not a square then
by the same reasoning it cannot have order dividing $(p - 1)/2$. We then have that $a^{(p-1)/2} = u$ for
some $u$ which will have order 2, and hence $u = -1$. Putting these two facts together implies we
can easily compute the Legendre symbol, via

$$\left(\frac{a}{p}\right) = a^{(p-1)/2} \pmod{p}.$$
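The exponentiation formula for the Legendre symbol is a one-liner with Python's three-argument `pow`. In this sketch the function name `legendre` is ours, and the only subtlety is mapping the residue $p - 1$ back to $-1$.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via a^((p-1)/2) mod p."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r   # r is 0, 1 or p - 1


p = 17
residues = [a for a in range(1, p) if legendre(a, p) == 1]
print(residues)
print(legendre(15, p), legendre(3, p), legendre(34, p))
```

For $p = 17$ the list of quadratic residues has $(p - 1)/2 = 8$ entries, in line with the counting argument above.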

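Using the above formula turns out to be a very inefficient way to compute the Legendre symbol.
In practice one uses the law of quadratic reciprocity, which for distinct odd primes $p$ and $q$ states

$$\left(\frac{q}{p}\right) = \left(\frac{p}{q}\right) (-1)^{(p-1)(q-1)/4}. \tag{1}$$

In other words we have

$$\left(\frac{q}{p}\right) = \begin{cases}
-\left(\dfrac{p}{q}\right) & \text{if } p = q = 3 \pmod{4}, \\[2mm]
\left(\dfrac{p}{q}\right) & \text{otherwise.}
\end{cases}$$

Using this law with the following additional formulae gives rise to a recursive algorithm for the
Legendre symbol:

$$\left(\frac{q}{p}\right) = \left(\frac{q \pmod{p}}{p}\right), \tag{2}$$

$$\left(\frac{q \cdot r}{p}\right) = \left(\frac{q}{p}\right) \cdot \left(\frac{r}{p}\right), \tag{3}$$

$$\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}. \tag{4}$$

Assuming we can factor, we can now compute the Legendre symbol

$$\begin{aligned}
\left(\frac{15}{17}\right) &= \left(\frac{3}{17}\right) \cdot \left(\frac{5}{17}\right) && \text{by equation (3)} \\
&= \left(\frac{17}{3}\right) \cdot \left(\frac{17}{5}\right) && \text{by equation (1)} \\
&= \left(\frac{2}{3}\right) \cdot \left(\frac{2}{5}\right) && \text{by equation (2)} \\
&= (-1) \cdot (-1)^3 && \text{by equation (4)} \\
&= 1.
\end{aligned}$$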


In  a  moment  we  shall  see  a  more  efficient  algorithm  which  does  not  require  us  to  factor  integers. 

1.3.6. Computing Square Roots Modulo p: Computing square roots of elements in $\mathbb{F}_p^*$ when
the square root exists turns out to be an easy task. Algorithm 1.3 gives one method, called Shanks’
Algorithm, of computing the square root of $a$ modulo $p$, when such a square root exists. When
$p = 3 \pmod{4}$, instead of Shanks’ algorithm, we can use the following formula

$$x \leftarrow a^{(p+1)/4} \pmod{p},$$

which has the advantage of being deterministic and more efficient than the general method of
Shanks. That this formula works is because

$$x^2 = a^{(p+1)/2} = a^{(p-1)/2} \cdot a = \left(\frac{a}{p}\right) \cdot a = a$$

where the last equality holds since we have assumed that $a$ is a quadratic residue modulo $p$ and so
it has Legendre symbol equal to one.
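Both routes to a square root modulo a prime can be sketched in Python. The code below uses the exponentiation shortcut when $p = 3 \pmod{4}$ and otherwise follows the steps of Shanks' algorithm. One deliberate deviation, made so the sketch is reproducible: the non-residue $n$ is found by trying $n = 2, 3, \ldots$ in order rather than at random. All function names are ours.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol via a^((p-1)/2) mod p, for an odd prime p."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def sqrt_mod_prime(a: int, p: int) -> int:
    """Return one square root of a modulo an odd prime p."""
    a %= p
    if a == 0:
        return 0
    if legendre(a, p) != 1:
        raise ValueError("a is not a quadratic residue modulo p")
    if p % 4 == 3:
        return pow(a, (p + 1) // 4, p)      # deterministic shortcut
    # General case: Shanks' algorithm.
    n = 2
    while legendre(n, p) != -1:              # sequential search (assumption of this sketch)
        n += 1
    e, q = 0, p - 1
    while q % 2 == 0:                        # write p - 1 = 2^e * q with q odd
        q //= 2
        e += 1
    y = pow(n, q, p)
    r = e
    x = pow(a, (q - 1) // 2, p)
    b = a * x * x % p
    x = a * x % p
    while b != 1:
        m, b_pow = 0, b
        while b_pow != 1:                    # smallest m with b^(2^m) = 1
            b_pow = b_pow * b_pow % p
            m += 1
        t = pow(y, 2 ** (r - m - 1), p)
        y = t * t % p
        r = m
        x = x * t % p
        b = b * y % p
    return x


root = sqrt_mod_prime(15, 17)
print(root, root * root % 17)
```

The other square root is always $p - x$. For $p = 19$, which is $3 \pmod{4}$, the call `sqrt_mod_prime(7, 19)` takes the shortcut branch.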

Algorithm 1.3: Shanks’ algorithm for extracting a square root of a modulo p

Choose a random n until one is found such that (n/p) = −1.
Let e, q be integers such that q is odd and p − 1 = 2^e · q.
y ← n^q (mod p).
r ← e.
x ← a^((q−1)/2) (mod p).
b ← a · x^2 (mod p).
x ← a · x (mod p).
while b ≠ 1 (mod p) do
    Find the smallest m such that b^(2^m) = 1 (mod p).
    t ← y^(2^(r−m−1)) (mod p).
    y ← t^2 (mod p).
    r ← m.
    x ← x · t (mod p).
    b ← b · y (mod p).
return x.

1.3.7. The Jacobi Symbol: The Legendre symbol above is only defined when its denominator is
a prime, but there is a generalization to composite denominators called the Jacobi symbol. Suppose
$n \ge 3$ is odd and

$$n = p_1^{e_1} \cdot p_2^{e_2} \cdots p_k^{e_k},$$

then the Jacobi symbol

$$\left(\frac{a}{n}\right)$$

is defined in terms of the Legendre symbol by

$$\left(\frac{a}{n}\right) = \left(\frac{a}{p_1}\right)^{e_1} \left(\frac{a}{p_2}\right)^{e_2} \cdots \left(\frac{a}{p_k}\right)^{e_k}.$$

The Jacobi symbol can be computed using a similar method to the Legendre symbol by making
use of the identity, derived from the law of quadratic reciprocity,

$$\left(\frac{a}{n}\right) = \left(\frac{2}{n}\right)^{e} \cdot \left(\frac{n \pmod{a_1}}{a_1}\right) \cdot (-1)^{(a_1 - 1) \cdot (n - 1)/4},$$

where $a = 2^e \cdot a_1$ and $a_1$ is odd. We also have the identities, for $n$ odd,

$$\left(\frac{1}{n}\right) = 1, \qquad \left(\frac{2}{n}\right) = (-1)^{(n^2 - 1)/8}, \qquad \left(\frac{-1}{n}\right) = (-1)^{(n - 1)/2}.$$

This now gives us a fast algorithm, which does not require factoring of integers, to determine the
Jacobi symbol, and so the Legendre symbol in the case where the denominator is prime. The only
factoring required is to extract the even part of a number. See Algorithm 1.4 which computes the
symbol $\left(\frac{a}{b}\right)$. As an example we have

$$\left(\frac{15}{17}\right) = (-1)^{56} \left(\frac{17}{15}\right) = \left(\frac{2}{15}\right) = (-1)^{28} = 1.$$


Algorithm 1.4: Jacobi symbol algorithm

if b ≤ 0 or b (mod 2) = 0 then return 0.
j ← 1.
if a < 0 then
    a ← −a.
    if b (mod 4) = 3 then j ← −j.
while a ≠ 0 do
    while a (mod 2) = 0 do
        a ← a/2.
        if b (mod 8) = 3 or b (mod 8) = 5 then j ← −j.
    (a, b) ← (b, a).
    if a (mod 4) = 3 and b (mod 4) = 3 then j ← −j.
    a ← a (mod b).
if b = 1 then return j.
return 0.

1.3.8. Squares and Pseudo-squares Modulo a Composite: Recall that the Legendre symbol
$\left(\frac{a}{p}\right)$ tells us whether $a$ is a square modulo $p$, for $p$ a prime. Alas, the Jacobi symbol $\left(\frac{a}{n}\right)$ does not
tell us the whole story about whether $a$ is a square modulo $n$, when $n$ is a composite. If $a$ is a
square modulo $n$ then the Jacobi symbol will be equal to plus one, however if the Jacobi symbol is
equal to plus one then it is not always true that $a$ is a square.

Let $n \ge 3$ be odd and let the set of squares in $(\mathbb{Z}/n\mathbb{Z})^*$ be denoted by

$$Q_n = \{x^2 \pmod{n} : x \in (\mathbb{Z}/n\mathbb{Z})^*\}.$$

Now let $J_n$ denote the set of elements with Jacobi symbol equal to plus one, i.e.

$$J_n = \left\{ x \in (\mathbb{Z}/n\mathbb{Z})^* : \left(\frac{x}{n}\right) = 1 \right\}.$$

The set of pseudo-squares is the difference $J_n \setminus Q_n$. There are two important cases for cryptography,
either $n$ is prime or $n$ is the product of two primes:

• $n$ is a prime $p$:
    * $Q_n = J_n$.
    * $\#Q_n = (n - 1)/2$.

• $n$ is the product of two primes, $n = p \cdot q$:
    * $Q_n \subset J_n$.
    * $\#Q_n = \#(J_n \setminus Q_n) = (p - 1)(q - 1)/4$.

The sets $Q_n$ and $J_n$ will be seen to be important in a number of algorithms and protocols, especially
in the case where $n$ is a product of two primes.
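The sets $Q_n$, $J_n$ and the pseudo-squares can be listed explicitly for a small modulus. The sketch below implements a Jacobi symbol routine in the style of Algorithm 1.4 (halve, apply the rule for 2, swap with a reciprocity sign, reduce) and then enumerates the sets for $n = 15 = 3 \cdot 5$; the choice of $n$ and the function name `jacobi` are ours.

```python
from math import gcd


def jacobi(a: int, b: int) -> int:
    """Jacobi symbol (a/b); returns 0 when b is not a positive odd integer."""
    if b <= 0 or b % 2 == 0:
        return 0
    j = 1
    if a < 0:
        a = -a
        if b % 4 == 3:
            j = -j
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if b % 8 in (3, 5):      # (2/b) = -1 exactly in these cases
                j = -j
        a, b = b, a                  # reciprocity step
        if a % 4 == 3 and b % 4 == 3:
            j = -j
        a %= b
    return j if b == 1 else 0


n = 15
units = [x for x in range(1, n) if gcd(x, n) == 1]
Q_n = sorted({x * x % n for x in units})
J_n = [x for x in units if jacobi(x, n) == 1]
pseudo_squares = [x for x in J_n if x not in Q_n]
print(Q_n, J_n, pseudo_squares)
```

For $n = 15$ there are $(3 - 1)(5 - 1)/4 = 2$ squares and 2 pseudo-squares. The element 2, for instance, has Jacobi symbol $+1$ but is not a square modulo 15.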

1.3.9. Square Roots Modulo n = p·q: We now look at how to compute a square root modulo a
composite number $n = p \cdot q$. Suppose we wish to compute the square root of $a$ modulo $n$. We assume
we know $p$ and $q$, and that $a$ really is a square modulo $n$, which can be checked by demonstrating
that

$$\left(\frac{a}{p}\right) = \left(\frac{a}{q}\right) = 1.$$

We first compute a square root of $a$ modulo $p$, call this $s_p$. Then we compute a square root of $a$
modulo $q$, call this $s_q$. Finally to deduce the square root modulo $n$, we apply the Chinese Remainder
Theorem to the equations

$$x = s_p \pmod{p} \quad \text{and} \quad x = s_q \pmod{q}.$$

Note that if we do not know the prime factors of $n$ then computing square roots modulo $n$ is
believed to be a very hard problem; indeed it is as hard as factoring $n$ itself.

As an example suppose we wish to compute the square root of $a = 217$ modulo $n = 221 = 13 \cdot 17$.
Now a square root of $a$ modulo 13 and 17 is given by

$$s_{13} = 3 \quad \text{and} \quad s_{17} = 8.$$

Applying the Chinese Remainder Theorem we find

$$s = 42$$

and we can check that $s$ really is a square root by computing

$$s^2 = 42^2 = 217 \pmod{n}.$$

There are three other square roots, since $n$ has two prime factors. These other square roots are
obtained by applying the Chinese Remainder Theorem to the other three equation pairs

$$s_{13} = 10, \quad s_{17} = 8,$$
$$s_{13} = 3, \quad s_{17} = 9,$$
$$s_{13} = 10, \quad s_{17} = 9.$$

Hence, all four square roots of 217 modulo 221 are given by 42, 94, 127 and 179.
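The procedure can be sketched in Python for the example modulus. To keep the sketch short, square roots modulo the small primes are found by exhaustive search, which is an assumption suitable only for toy sizes (for real parameters one would use the methods of Section 1.3.6). Python 3.8 or later is assumed for `pow(M, -1, N)`, and the function names are ours.

```python
def sqrt_mod_small_prime(a: int, p: int) -> list:
    """All square roots of a modulo a small prime p, by exhaustive search."""
    return [x for x in range(p) if (x * x - a) % p == 0]


def crt_two(a: int, M: int, b: int, N: int) -> int:
    """Solve x = a (mod M), x = b (mod N) for coprime M and N."""
    T = pow(M, -1, N)
    u = ((b - a) * T) % N
    return a + u * M


def sqrt_mod_pq(a: int, p: int, q: int) -> list:
    """All square roots of a modulo n = p*q, given the factors p and q."""
    roots = set()
    for s_p in sqrt_mod_small_prime(a, p):
        for s_q in sqrt_mod_small_prime(a, q):
            roots.add(crt_two(s_p, p, s_q, q))
    return sorted(roots)


roots = sqrt_mod_pq(217, 13, 17)
print(roots, [r * r % 221 for r in roots])
```

Each of the four pairs $(s_{13}, s_{17})$ contributes one root, and squaring any of them modulo 221 returns 217.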


1.4.  Probability 

At some points we will need a basic understanding of elementary probability theory. In this section
we summarize the theory we require and give a few examples. Most readers should find this a
revision of the type of probability encountered in high school. A random variable is a variable $X$
which takes certain values with given probabilities. If $X$ takes the value $s$ with probability 0.01 we
write this as

$$p(X = s) = 0.01.$$

As an example, let $T$ be the random variable representing tosses of a fair coin, we then have the
probabilities

$$p(T = \text{Heads}) = \frac{1}{2},$$

$$p(T = \text{Tails}) = \frac{1}{2}.$$

As another example let $E$ be the random variable representing letters in English text. An analysis
of a large amount of English text allows us to approximate the relevant probabilities by

$$\begin{aligned}
p(E = a) &= 0.082, \\
&\;\;\vdots \\
p(E = e) &= 0.127, \\
&\;\;\vdots \\
p(E = z) &= 0.001.
\end{aligned}$$

Basically if $X$ is a discrete random variable on a set $S$, and $p(X = x)$ is the probability distribution,
i.e. the probability of a value $x$ being selected from $S$, then we have the two following properties:

$$p(X = x) \ge 0 \quad \text{for all } x \in S,$$

$$\sum_{x \in S} p(X = x) = 1.$$

It is common to illustrate examples from probability theory using a standard deck of cards. We
shall do likewise and let $V$ denote the random variable that a card is a particular value, let $S$ denote
the random variable that a card is a particular suit and let $C$ denote the random variable of the
colour of a card. So for example

$$p(C = \text{Red}) = \frac{1}{2},$$

$$p(V = \text{Ace of Clubs}) = \frac{1}{52},$$

$$p(S = \text{Clubs}) = \frac{1}{4}.$$

Let $X$ and $Y$ be two random variables, where $p(X = x)$ is the probability that $X$ takes the value
$x$ and $p(Y = y)$ is the probability that $Y$ takes the value $y$. The joint probability $p(X = x, Y = y)$
is defined as the probability that $X$ takes the value $x$ and $Y$ takes the value $y$. So if we let $X = C$
and $Y = S$ then we have

$$\begin{aligned}
p(C = \text{Red}, S = \text{Club}) &= 0, & p(C = \text{Red}, S = \text{Diamonds}) &= \tfrac{1}{4}, \\
p(C = \text{Red}, S = \text{Hearts}) &= \tfrac{1}{4}, & p(C = \text{Red}, S = \text{Spades}) &= 0, \\
p(C = \text{Black}, S = \text{Club}) &= \tfrac{1}{4}, & p(C = \text{Black}, S = \text{Diamonds}) &= 0, \\
p(C = \text{Black}, S = \text{Hearts}) &= 0, & p(C = \text{Black}, S = \text{Spades}) &= \tfrac{1}{4}.
\end{aligned}$$

Two random variables $X$ and $Y$ are said to be independent if, for all values of $x$ and $y$,

$$p(X = x, Y = y) = p(X = x) \cdot p(Y = y).$$

Hence, the random variables $C$ and $S$ are not independent. As an example of independent random
variables consider the two random variables $T_1$, the value of the first toss of an unbiased coin, and
$T_2$, the value of a second toss of the coin. Since, assuming standard physical laws, the toss of the
first coin does not affect the outcome of the toss of the second coin, we say that $T_1$ and $T_2$ are
independent. This is confirmed by the joint probability distribution

$$\begin{aligned}
p(T_1 = H, T_2 = H) &= \tfrac{1}{4}, & p(T_1 = H, T_2 = T) &= \tfrac{1}{4}, \\
p(T_1 = T, T_2 = H) &= \tfrac{1}{4}, & p(T_1 = T, T_2 = T) &= \tfrac{1}{4}.
\end{aligned}$$
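The independence test can be carried out exactly with rational arithmetic. The following Python sketch models a 52-card deck and two coin tosses as uniform sample spaces; the data layout (cards as `(value, suit)` pairs with values 1 to 13) and all names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

COLOUR = {"Clubs": "Black", "Spades": "Black", "Hearts": "Red", "Diamonds": "Red"}
deck = [(value, suit) for value in range(1, 14) for suit in COLOUR]


def prob(outcomes, event) -> Fraction:
    """Probability of an event under the uniform distribution on outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def p_colour(colour):
    return prob(deck, lambda card: COLOUR[card[1]] == colour)


def p_suit(suit):
    return prob(deck, lambda card: card[1] == suit)


def p_joint(colour, suit):
    return prob(deck, lambda card: COLOUR[card[1]] == colour and card[1] == suit)


# C and S: compare the joint distribution with the product of the marginals.
cards_independent = all(
    p_joint(c, s) == p_colour(c) * p_suit(s)
    for c in ("Red", "Black") for s in COLOUR
)

# T1 and T2: two tosses of a fair coin.
tosses = list(product("HT", repeat=2))
coins_independent = all(
    prob(tosses, lambda t: t == (x, y))
    == prob(tosses, lambda t: t[0] == x) * prob(tosses, lambda t: t[1] == y)
    for x in "HT" for y in "HT"
)
print(cards_independent, coins_independent)
```

The check fails for colour and suit, since for example $p(C = \text{Red}, S = \text{Hearts}) = 1/4$ whereas $p(C = \text{Red}) \cdot p(S = \text{Hearts}) = 1/8$, and it succeeds for the two coin tosses.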


1.4.1. Bayes’ Theorem: The conditional probability $p(X = x \mid Y = y)$ of two random variables
$X$ and $Y$ is defined as the probability that $X$ takes the value $x$ given that $Y$ takes the value $y$.
Returning to our random variables based on a pack of cards we have

$$p(S = \text{Spades} \mid C = \text{Red}) = 0$$

and

$$p(V = \text{Ace of Spades} \mid C = \text{Black}) = \frac{1}{26}.$$

The first follows since if we know that a card is red, then the probability that it is a spade is zero,
since a red card cannot be a spade. The second follows since if we know a card is black then we
have restricted the set of cards to half the pack, one of which is the ace of spades.

The following is one of the most crucial statements in probability theory, which you should
recall from high school.

Theorem 1.7 (Bayes’ Theorem). If $p(Y = y) > 0$ then

$$p(X = x \mid Y = y) = \frac{p(X = x) \cdot p(Y = y \mid X = x)}{p(Y = y)} = \frac{p(X = x, Y = y)}{p(Y = y)}.$$

We can apply Bayes’ Theorem to our examples above as follows

$$p(S = \text{Spades} \mid C = \text{Red}) = \frac{p(S = \text{Spades}, C = \text{Red})}{p(C = \text{Red})} = 0 \cdot \left(\frac{1}{2}\right)^{-1} = 0,$$

$$p(V = \text{Ace of Spades} \mid C = \text{Black}) = \frac{p(V = \text{Ace of Spades}, C = \text{Black})}{p(C = \text{Black})} = \frac{1}{52} \cdot \left(\frac{1}{2}\right)^{-1} = \frac{2}{52} = \frac{1}{26}.$$

If $X$ and $Y$ are independent then we have

$$p(X = x \mid Y = y) = p(X = x),$$

i.e. the value that $X$ takes does not depend on the value that $Y$ takes. An identity which we will
use a lot is the following, for events $A$ and $B$,

$$\begin{aligned}
p(A) &= p(A, B) + p(A, \neg B) \\
&= p(A \mid B) \cdot p(B) + p(A \mid \neg B) \cdot p(\neg B),
\end{aligned}$$

where $\neg B$ is the event that $B$ does not happen.
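Both forms of Bayes' Theorem and the identity for $p(A)$ can be checked on the deck of cards with exact fractions. In this Python sketch the card encoding (value 1 standing for the ace) and the helper names `p` and `p_given` are our own choices.

```python
from fractions import Fraction

SUIT_COLOUR = {"Clubs": "Black", "Spades": "Black", "Hearts": "Red", "Diamonds": "Red"}
deck = [(value, suit) for value in range(1, 14) for suit in SUIT_COLOUR]  # value 1 = Ace


def p(event) -> Fraction:
    """Probability of an event for a card drawn uniformly from the deck."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def p_given(event, given) -> Fraction:
    """Conditional probability p(event | given) = p(event, given) / p(given)."""
    return p(lambda card: event(card) and given(card)) / p(given)


def is_black(card): return SUIT_COLOUR[card[1]] == "Black"
def is_red(card): return not is_black(card)
def is_spade(card): return card[1] == "Spades"
def is_ace_of_spades(card): return card == (1, "Spades")
def is_ace(card): return card[0] == 1


# Bayes' Theorem, first form: p(X) * p(Y | X) / p(Y).
bayes_rhs = p(is_ace_of_spades) * p_given(is_black, is_ace_of_spades) / p(is_black)

# p(A) = p(A | B) p(B) + p(A | not B) p(not B), with A = ace and B = black.
total = p_given(is_ace, is_black) * p(is_black) + p_given(is_ace, is_red) * p(is_red)

print(p_given(is_spade, is_red), p_given(is_ace_of_spades, is_black), bayes_rhs, total)
```

The two conditional probabilities come out as 0 and 1/26, the first form of the theorem gives the same 1/26, and the final sum equals $p(\text{Ace}) = 1/13$.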


1.4.2. Birthday Paradox: Another useful result from elementary probability theory that we will
require is the birthday paradox. Suppose a bag has $m$ balls in it, all of different colours. We draw
one ball at a time from the bag and write down its colour; we then replace the ball in the bag and
draw again. If we define
$$m^{(n)} = m \cdot (m - 1) \cdot (m - 2) \cdots (m - n + 1)$$
then the probability, after $n$ balls have been taken out of the bag, that we have obtained at least
one matching colour (or coincidence) is
$$1 - \frac{m^{(n)}}{m^n}.$$


As $m$ becomes larger the expected number of balls we have to draw before we obtain the first
coincidence is
$$\sqrt{\frac{\pi \cdot m}{2}}.$$


To  see  why  this  is  called  the  birthday  paradox  consider  the  probability  of  two  people  in  a  room 
sharing  the  same  birthday.  Most  people  initially  think  that  this  probability  should  be  quite  low, 
since  they  are  thinking  of  the  probability  that  someone  in  the  room  shares  the  same  birthday  as 
them. One can now easily compute that the probability of at least two people in a room of 23
people having the same birthday is
$$1 - \frac{365^{(23)}}{365^{23}} \approx 0.507.$$


In  fact  this  probability  increases  quite  quickly  since  in  a  room  of  30  people  we  obtain  a  probability 
of  approximately  0.706,  and  in  a  room  of  100  people  we  obtain  a  probability  of  over  0.999  999  6. 
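
These figures are easy to reproduce. The sketch below assumes 365 equally likely birthdays and uses a hypothetical helper name, `collision_probability`; it simply evaluates one minus the probability that all draws are distinct.

```python
def collision_probability(m, n):
    """Probability of at least one repeat when drawing n times from m values."""
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (m - i) / m  # i values are already taken
    return 1.0 - p_all_distinct


for people in (23, 30, 100):
    print(people, collision_probability(365, people))
```

Floating-point arithmetic is sufficient here; for very small probabilities exact fractions would be a safer choice.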

In  many  situations  in  cryptography  we  use  the  birthday  paradox  in  the  following  way.  We  are 
given  a  random  process  which  outputs  elements  from  a  set  of  size  m,  just  like  the  balls  above.  We 
run  the  process  for  n  steps,  again  just  like  above.  But  instead  of  wanting  to  know  how  many  times 
we  need  to  execute  the  process  to  find  a  collision  we  instead  want  to  know  an  upper  bound  on  the 
probability of finding a collision after $n$ steps (think of $n$ being much smaller than $m$). This is easy
to estimate due to the following inequalities:
$$\begin{aligned} \Pr[\text{At least one repetition in pulling } n \text{ elements from } m] &\le \sum_{1 \le i < j \le n} \Pr[\text{Item } i \text{ collides with item } j] \\ &= \binom{n}{2} \cdot \frac{1}{m} = \frac{n \cdot (n - 1)}{2 \cdot m} \le \frac{n^2}{2 \cdot m}. \end{aligned}$$
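
To see how tight this bound is when $n$ is much smaller than $m$, the following sketch compares the exact collision probability with the two upper bounds. The function names are illustrative assumptions, and exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction


def exact_collision_probability(m, n):
    """Exact probability of a repeat when drawing n times from m values."""
    p_all_distinct = Fraction(1)
    for i in range(n):
        p_all_distinct *= Fraction(m - i, m)
    return 1 - p_all_distinct


def pair_bound(m, n):
    """Sum over all pairs i < j of Pr[item i collides with item j]."""
    return Fraction(n * (n - 1), 2 * m)


def simple_bound(m, n):
    """The weaker but more convenient bound n^2 / (2 m)."""
    return Fraction(n * n, 2 * m)


m, n = 2 ** 32, 1000
print(float(exact_collision_probability(m, n)))
print(float(pair_bound(m, n)))
print(float(simple_bound(m, n)))
```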


1.5.  Big  Numbers 

At  various  points  we  need  to  discuss  how  big  a  number  can  be  before  it  is  impossible  for  someone 
to  perform  that  many  operations.  Such  big  numbers  are  used  in  cryptography  to  measure  the  work 
effort  of  the  adversary.  Suppose  we  had  a  (mythical)  computer  which  could  do  one  trillion  “basic” 
operations  per  second.  Note  that  a  modern  3  GHz  computer  with  eight  “cores”  can  only  do  24 
billion  operations  per  second,  so  our  mythical  computer  is  around  42  times  faster  than  a  current 
desktop  computer. 

Suppose we had an algorithm which took $2^t$ “basic” operations. We want to know how long
our mythical computer would take to perform these $2^t$ operations. Now one trillion is about $2^{40}$.
Thus to perform $2^{64}$ operations would require $2^{64-40} = 2^{24}$ seconds, or 194 days. Given that
finding 194 computers is not very hard, a calculation which takes $2^{64}$ basic operations could be
performed by someone with just under 200 computers in under a day. An algorithm which took
$2^{80}$ “basic” operations would take $2^{40}$ seconds for our mythical computer, or nearly 34 900 years.
Thus a large government-funded laboratory which could afford perhaps 15 000 mythical computers
could perform the algorithm requiring $2^{80}$ operations in about two years. This might be expensive,
but if national security depended on it, then a computation of $2^{80}$ operations would be plausible.

However, when we go to an algorithm which requires $2^{128}$ operations then our mythical computer
would require $2^{88}$ seconds or 9 quintillion years (i.e. $9 \cdot 10^{18}$ years). Note, the universe is only believed
to  be  13.8  billion  years  old.  Thus  a  computation  which  required  9  quintillion  years  is  essentially 
impossible,  ever!!!! 

To get an idea of how big these numbers are consider that $2^{80}$ is a number with 25 decimal
digits, whereas $2^{128}$ is a number with 39 decimal digits. These are both significantly more than the
number of cells in the human body (which is around $10^{14}$), or the number of stars in the observable
universe (which is around $10^{22}$).
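
The arithmetic in this section can be reproduced in a few lines. The sketch below takes the text's approximation of one trillion operations per second as $2^{40}$ and, as an additional assumption, uses 365-day years; the names are illustrative.

```python
OPS_PER_SECOND = 2 ** 40  # the text's approximation: one trillion is about 2^40
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY  # assumption: 365-day years


def seconds_for(log2_operations):
    """Seconds the mythical computer needs for 2^log2_operations operations."""
    return 2 ** log2_operations // OPS_PER_SECOND


days_for_2_64 = seconds_for(64) / SECONDS_PER_DAY
years_for_2_80 = seconds_for(80) / SECONDS_PER_YEAR
years_for_2_128 = seconds_for(128) / SECONDS_PER_YEAR

print(f"2^64 operations: about {days_for_2_64:.0f} days")
print(f"2^80 operations: about {years_for_2_80:.0f} years")
print(f"2^128 operations: about {years_for_2_128:.2e} years")
print(len(str(2 ** 80)), len(str(2 ** 128)))  # decimal digit counts
```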


Chapter  Summary 

•  A  group  is  a  set  with  an  operation  which  has  an  identity,  is  associative  and  in  which  every 
element  has  an  inverse. 

•  Addition  and  multiplication  in  modular  arithmetic  both  provide  examples  of  groups. 

• For modular multiplication we need to be careful which set of numbers we take when
defining such a group, as not all integers modulo $m$ are invertible with respect to multiplication.

• A ring is a set with two operations which behaves like the set of integers under addition
and multiplication. Modular arithmetic is an example of a ring.

• A field is a ring in which all non-zero elements have a multiplicative inverse. The integers
modulo a prime is an example of a field.

•  Multiplicative  inverses  for  modular  arithmetic  can  be  found  using  the  extended  Euclidean 
algorithm. 

•  Sets  of  simultaneous  linear  modular  equations  can  be  solved  using  the  Chinese  Remainder 
Theorem. 

•  Square  elements  modulo  a  prime  can  be  detected  using  the  Legendre  symbol;  square  roots 
can  be  efficiently  computed  using  Shanks’  Algorithm. 

•  Square  elements  and  square  roots  modulo  a  composite  can  be  determined  efficiently  as 
long  as  one  knows  the  factorization  of  the  modulus. 

•  Bayes’  Theorem  allows  us  to  compute  conditional  probabilities. 

• The birthday paradox allows us to estimate how quickly collisions occur when one repeatedly
samples from a finite space.

•  We  also  discussed  how  big  various  numbers  are,  as  a  means  to  work  out  what  is  a  feasible 
computation. 


Further  Reading 

Bach and Shallit is the best introductory book I know of which deals with Euclid’s algorithm
and finite fields. It contains a lot of historical information, plus excellent pointers to the relevant
research literature. Whilst aimed in some respects at Computer Scientists, Bach and Shallit’s book
may  be  a  little  too  mathematical  for  some.  For  a  more  traditional  introduction  to  the  basic  discrete 
mathematics  we  shall  need,  see  the  books  by  Biggs  or  Rosen. 

E.  Bach  and  J.  Shallit.  Algorithmic  Number  Theory.  Volume  1:  Efficient  Algorithms.  MIT  Press, 
1996. 

N.L.  Biggs.  Discrete  Mathematics.  Oxford  University  Press,  1989. 

K.H.  Rosen.  Discrete  Mathematics  and  Its  Applications.  McGraw-Hill,  1999. 


CHAPTER  2 


Primality  Testing  and  Factoring 


Chapter  Goals 


•  To  explain  the  basics  of  primality  testing. 

•  To  describe  the  most  used  primality-testing  algorithm,  namely  Miller-Rabin. 

•  To  examine  the  relationship  between  various  mathematical  problems  based  on  factoring. 

•  To  explain  various  factoring  algorithms. 

•  To  sketch  how  the  most  successful  factoring  algorithm  works,  namely  the  Number  Field 
Sieve. 


2.1.  Prime  Numbers 

The  generation  of  prime  numbers  is  needed  for  almost  all  public  key  algorithms,  for  example 

•  In  the  RSA  encryption  or  the  Rabin  encryption  system  we  need  to  find  primes  p  and  q  to 
compute  the  public  key  N  =  p  •  q. 

• In ElGamal encryption we need to find primes $p$ and $q$ with $q$ dividing $p - 1$.

• In the elliptic curve variant of ElGamal we require an elliptic curve over a finite field, such
that the order of the elliptic curve is divisible by a large prime $q$.

Luckily  we  shall  see  that  testing  a  number  for  primality  can  be  done  very  fast  using  very  simple 
code,  but  with  an  algorithm  that  has  a  probability  of  error.  By  repeating  this  algorithm  we  can 
reduce  the  error  probability  to  any  value  that  we  require. 

Some  of  the  more  advanced  primality-testing  techniques  will  produce  a  certificate  which  can 
be  checked  by  a  third  party  to  prove  that  the  number  is  indeed  prime.  Clearly  one  requirement 
of  such  a  certificate  is  that  it  should  be  quicker  to  verify  than  it  is  to  generate.  Such  a  primality- 
testing  routine  will  be  called  a  primality-proving  algorithm,  and  the  certificate  will  be  called  a  proof 
of  primality.  However,  the  main  primality-testing  algorithm  used  in  cryptographic  systems  only 
produces  certificates  of  compositeness  and  not  certificates  of  primality. 

For  many  years  this  was  the  best  that  we  could  do;  i.e.  either  we  could  use  a  test  which  had  a 
small  chance  of  error,  or  we  spent  a  lot  of  time  producing  a  proof  of  primality  which  could  be  checked 
quickly.  However,  in  2002  Agrawal,  Kayal  and  Saxena  presented  a  deterministic  polynomial-time 
primality  test  thus  showing  that  the  problem  of  determining  whether  a  number  was  prime  was 
in the complexity class $\mathcal{P}$. However, the so-called AKS Algorithm is not used in practice as the
algorithms  which  have  a  small  error  are  more  efficient  and  the  error  can  be  made  vanishingly  small 
at  little  extra  cost. 

2.1.1.  The  Prime  Number  Theorem:  Before  discussing  these  algorithms,  we  need  to  look  at 
some  basic  heuristics  concerning  prime  numbers.  A  famous  result  in  mathematics,  conjectured  by 
Gauss after extensive calculation in the early 1800s, is the Prime Number Theorem:

Theorem 2.1 (Prime Number Theorem). The function $\pi(X)$ counts the number of primes less
than $X$, where we have the approximation
$$\pi(X) \approx \frac{X}{\log X}.$$


This means primes are quite common. For example, the number of primes less than $2^{1024}$ is about
$2^{1014}$. The Prime Number Theorem also allows us to estimate the probability of a random number
being prime: if $p$ is a number chosen at random then the probability it is prime is about
$$\frac{1}{\log p}.$$
So a random number $p$ of 1024 bits in length will be a prime with probability
$$\approx \frac{1}{\log p} \approx \frac{1}{709}.$$
So on average we need to select 354 odd numbers of size $2^{1024}$ before we find one which is prime.
Hence,  it  is  practical  to  generate  large  primes,  as  long  as  we  can  test  primality  efficiently. 
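
The estimate above is a one-line computation. The sketch below assumes the natural logarithm in the Prime Number Theorem and treats a random $b$-bit number as having size about $2^b$; the function names are illustrative.

```python
import math


def prime_probability(bits):
    """Heuristic probability that a random number of about 2^bits is prime."""
    return 1.0 / (bits * math.log(2))  # log(2^bits) = bits * log(2)


def expected_odd_candidates(bits):
    """Expected number of random odd candidates tried before hitting a prime."""
    return bits * math.log(2) / 2  # restricting to odd numbers halves the work


print(1 / prime_probability(1024))
print(expected_odd_candidates(1024))
```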


2.1.2.  Trial  Division:  The  naive  test  for  testing  a  number  p  to  be  prime  is  one  of  trial  division. 
We essentially take all numbers between 2 and $\sqrt{p}$ and see whether one of them divides $p$; if not
then $p$ is prime. If such a number does divide $p$ then we obtain the added bonus of finding a factor
of  the  composite  number  p.  Hence,  trial  division  has  the  advantage  (compared  with  more  advanced 
primality-testing/proving  algorithms)  that  it  either  determines  that  p  is  a  prime,  or  determines  a 
non-trivial  factor  of  p. 

However,  primality  testing  by  using  trial  division  is  a  terrible  strategy.  In  the  worst  case,  when 
$p$ is a prime, the algorithm requires $\sqrt{p}$ steps to run, which is an exponential function in terms of
the  size  of  the  input  to  the  problem.  Another  drawback  is  that  it  does  not  produce  a  certificate 
for  the  primality  of  p,  in  the  case  when  the  input  p  is  prime.  When  p  is  not  prime  it  produces  a 
certificate  which  can  easily  be  checked  to  prove  that  p  is  composite,  namely  a  non-trivial  factor  of 
p.  But  when  p  is  prime  the  only  way  we  can  verify  this  fact  again  (say  to  convince  a  third  party) 
is  to  repeat  the  algorithm  once  more. 

Despite its drawbacks, however, trial division is the method of choice for numbers which are very
small. In addition, partial trial division up to a bound $Y$ is able to eliminate all but a proportion
$$\prod_{p < Y} \left(1 - \frac{1}{p}\right)$$
of all composites. This method of eliminating composites is very old and is called the Sieve of
Eratosthenes. Naively this is what we would always do, since we would never check an even
number greater than two for primality, as it is obviously composite. Hence, many primality-testing
algorithms first do trial division with all primes up to say 100, so as to eliminate all but the
proportion
$$\prod_{p < 100} \left(1 - \frac{1}{p}\right) \approx 0.12$$
of composites.
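
Trial division, and the proportion of candidates that survive partial trial division, can be sketched as follows. This is a minimal illustration rather than an efficient implementation; the function names are assumptions, and `math.isqrt` requires Python 3.8 or later.

```python
from math import isqrt


def trial_division(p):
    """Return the smallest non-trivial factor of p >= 2, or None if p is prime."""
    for d in range(2, isqrt(p) + 1):
        if p % d == 0:
            return d  # d is a certificate of compositeness
    return None


def surviving_proportion(bound):
    """Product of (1 - 1/q) over all primes q < bound."""
    proportion = 1.0
    for q in range(2, bound):
        if trial_division(q) is None:  # q is prime
            proportion *= 1 - 1 / q
    return proportion


print(trial_division(341))
print(trial_division(97))
print(surviving_proportion(100))
```

Note that a returned factor can be verified with a single division, whereas a result of `None` can only be re-checked by running the whole loop again.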


2.1.3.  Fermat’s  Test:  Most  advanced  probabilistic  algorithms  for  testing  primality  make  use  of 
the converse to Fermat’s Little Theorem. Recall Lagrange’s Theorem from Chapter 1; this said
that if $G$ is a multiplicative group of size $\#G$ then
$$a^{\#G} = 1$$
for all values $a \in G$. So if $G$ is the group of integers modulo $n$ under multiplication then
$$a^{\phi(n)} \equiv 1 \pmod{n}$$
for all $a \in (\mathbb{Z}/n\mathbb{Z})^*$. Fermat’s Little Theorem is the case where $n = p$ is prime, in which case the
above equality becomes
$$a^{p-1} \equiv 1 \pmod{p}.$$
So if $n$ is prime we have that
$$a^{n-1} \equiv 1 \pmod{n}$$
always holds, whilst if $n$ is not prime then we have that
$$a^{n-1} \equiv 1 \pmod{n}$$
is “unlikely” to hold.

Since computing $a^{n-1} \pmod{n}$ is a very fast operation (see Chapter 6) this gives us a very fast
test  for  compositeness  called  the  Fermat  Test  to  the  base  a.  Running  the  Fermat  Test  can  only 
convince  us  of  the  compositeness  of  n.  It  can  never  prove  to  us  that  a  number  is  prime,  only  that 
it  is  not  prime. 

To see why it does not prove primality consider the case $n = 11 \cdot 31 = 341$ and the base $a = 2$:
we have
$$a^{n-1} = 2^{340} \equiv 1 \pmod{341}$$
but $n$ is clearly not prime. In such a case we say that $n$ is a (Fermat) pseudo-prime to the base 2.
There are infinitely many pseudo-primes to any given base. It can be shown that if $n$ is composite,
and is not one of the Carmichael numbers discussed below, then for a random base $a$ coprime to $n$
we obtain, with probability at least 1/2,
$$a^{n-1} \not\equiv 1 \pmod{n}.$$

This gives us Algorithm 2.1 to test $n$ for primality.

Algorithm 2.1: Fermat’s test for primality

for $i = 0$ to $k - 1$ do
    Pick $a \in [2, \ldots, n - 1]$.
    $b \leftarrow a^{n-1} \bmod n$.
    if $b \ne 1$ then return (Composite, $a$).
return “Probably Prime”.

If Algorithm 2.1 outputs (Composite, $a$) then we know

• $n$ is definitely a composite number,

• $a$ is a witness for this compositeness, in that we can verify that $n$ is composite by using
the value of $a$.

If the above algorithm outputs “Probably Prime” then

• $n$ is a composite with probability at most $1/2^k$,

• $n$ is either a prime or a so-called probable prime.

For example if we take
$$n = 43\,040\,357,$$
then $n$ is a composite, with one witness given by $a = 2$ since
$$2^{n-1} \pmod{n} = 9\,888\,212.$$
As another example take
$$n = 2^{192} - 2^{64} - 1,$$
then the algorithm outputs “Probably Prime” since we cannot find a witness for compositeness.
Actually this $n$ is a prime, so it is not surprising we did not find a witness for compositeness!
However, there are composite numbers for which the Fermat Test will always output “Probably Prime”
for every $a$ coprime to $n$. These numbers are called Carmichael numbers, and to make things worse
there are infinitely many of them. The first three are 561, 1105 and 1729. Carmichael numbers
have the following properties:

•  They  are  always  odd. 

•  They  have  at  least  three  prime  factors. 

•  They  are  square  free. 

• If $p$ divides a Carmichael number $N$, then $p - 1$ divides $N - 1$.

To give you some idea of their density, if we look at all numbers less than $10^{16}$ then there are about
$2.7 \cdot 10^{14}$ primes in this region, but only $246\,683 \approx 2.4 \cdot 10^5$ Carmichael numbers. Hence, Carmichael
numbers are rare, but not rare enough to be ignored completely.
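
The Fermat Test and the behaviour of Carmichael numbers can be explored with a short Python sketch. This is a minimal illustration of Algorithm 2.1, not a production primality test; the helper `fools_every_coprime_base` is a hypothetical name introduced here, and Python's built-in three-argument `pow` is assumed for the modular exponentiation.

```python
import random
from math import gcd


def fermat_test(n, k, rng=random):
    """Run k rounds of the Fermat Test on an integer n >= 3."""
    for _ in range(k):
        a = rng.randrange(2, n)  # a in [2, ..., n - 1]
        if pow(a, n - 1, n) != 1:  # fast modular exponentiation
            return ("Composite", a)  # a is a witness
    return "Probably Prime"


def fools_every_coprime_base(n):
    """True if a^(n-1) = 1 (mod n) for every base a coprime to n."""
    return all(pow(a, n - 1, n) == 1 for a in range(2, n) if gcd(a, n) == 1)


print(pow(2, 340, 341))  # 341 is a pseudo-prime to the base 2
print([n for n in (341, 561, 1105, 1729) if fools_every_coprime_base(n)])
print(fermat_test(2 ** 192 - 2 ** 64 - 1, 10))
```

For a Carmichael number the only bases that reveal compositeness are those sharing a factor with $n$, which is why the test is unreliable on such inputs.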

2.1.4. Miller-Rabin Test: Due to the existence of Carmichael numbers the Fermat Test is usually
avoided. However, there is a modification of the Fermat Test, called the Miller-Rabin Test,
which avoids the problem of composites for which no witness exists. This does not mean it is easy
to find a witness for each composite; it only means that a witness must exist. In addition the
Miller-Rabin Test has probability at most 1/4 of accepting a composite as prime for each random base
$a$, so again repeated application of the algorithm allows us to reduce the error probability down to
any value we care to mention.

The  Miller-Rabin  Test  is  given  by  the  pseudo-code  in  Algorithm  2.2.  We  do  not  show  that  the 
Miller-Rabin  Test  works.  If  you  are  interested  in  the  reason  see  any  book  on  algorithmic  number 
theory  for  the  details,  for  example  that  by  Cohen  or  Bach  and  Shallit  mentioned  in  the  Further 
Reading section of this chapter. Just as with the Fermat Test, we repeat the method $k$ times with
$k$ different bases, to obtain an error probability of $1/4^k$ if the algorithm always returns “Probably
Prime”. Hence, we expect that the Miller-Rabin Test will output “Probably Prime” for values of
$k \ge 20$ only when $n$ is actually a prime.


Algorithm 2.2: Miller-Rabin algorithm

Write $n - 1 = 2^s \cdot m$, with $m$ odd.
for $j = 0$ to $k - 1$ do
    Pick $a \in [2, \ldots, n - 2]$.
    $b \leftarrow a^m \bmod n$.
    if $b \ne 1$ and $b \ne (n - 1)$ then
        $i \leftarrow 1$.
        while $i < s$ and $b \ne (n - 1)$ do
            $b \leftarrow b^2 \bmod n$.
            if $b = 1$ then return (Composite, $a$).
            $i \leftarrow i + 1$.
        if $b \ne (n - 1)$ then return (Composite, $a$).
return “Probably Prime”.


If $n$ is a composite then the value of $a$ output by Algorithm 2.2 is called a Miller-Rabin witness
for the compositeness of $n$, and under the Generalized Riemann Hypothesis (GRH), a conjecture
believed to be true by most mathematicians, there is always a Miller-Rabin witness $a$ for the
compositeness of $n$ with
$$a \le O((\log n)^2).$$
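
A direct transcription of the Miller-Rabin procedure into Python might look as follows. This is a sketch for illustration: the split into a per-base helper `is_miller_rabin_witness` and the handling of small and even inputs are choices made here, not part of the algorithm as stated in the text.

```python
import random


def is_miller_rabin_witness(n, a):
    """True if base a proves that the odd integer n > 3 is composite."""
    s, m = 0, n - 1
    while m % 2 == 0:  # write n - 1 = 2^s * m with m odd
        s, m = s + 1, m // 2
    b = pow(a, m, n)
    if b == 1 or b == n - 1:
        return False
    for _ in range(s - 1):
        b = pow(b, 2, n)
        if b == n - 1:
            return False
        if b == 1:  # found a non-trivial square root of 1
            return True
    return True


def miller_rabin(n, k, rng=random):
    """Run k rounds of the Miller-Rabin Test on an integer n >= 2."""
    if n in (2, 3):
        return "Probably Prime"
    if n % 2 == 0:
        return ("Composite", 2)
    for _ in range(k):
        a = rng.randrange(2, n - 1)  # a in [2, ..., n - 2]
        if is_miller_rabin_witness(n, a):
            return ("Composite", a)
    return "Probably Prime"


print(is_miller_rabin_witness(561, 2))  # the Carmichael number 561
print(miller_rabin(2 ** 192 - 2 ** 64 - 1, 20))
```

Unlike the Fermat Test, the base 2 already exposes both the pseudo-prime 341 and the Carmichael number 561 as composite.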

2.1.5.  Primality  Proofs:  Up  to  now  we  have  only  output  witnesses  for  compositeness,  and  we 
can  interpret  such  a  witness  as  a  proof  of  compositeness.  In  addition  we  have  only  obtained  probable 
primes,  rather  than  numbers  which  are  one  hundred  percent  guaranteed  to  be  prime.  In  practice 
this  seems  to  be  all  right,  since  the  probability  of  a  composite  number  passing  the  Miller-Rabin 
Test for twenty bases is around $2^{-40}$, which should never really occur in practice. But theoretically
(and  maybe  in  practice  if  we  are  totally  paranoid)  this  could  be  a  problem.  In  other  words  we  may 
want  real  primes  and  not  just  probable  ones. 

There  are  algorithms  whose  output  is  a  witness  for  the  primality  of  the  number.  Such  a  witness 
is  called  a  proof  of  primality.  In  practice  such  programs  are  only  used  when  we  are  morally  certain 
that  the  number  we  are  testing  for  primality  is  actually  prime.  In  other  words  the  number  has 
already  passed  the  Miller-Rabin  Test  for  a  number  of  bases  and  all  we  now  require  is  a  proof  of 
the  primality. 

The  most  successful  of  these  primality-proving  algorithms  is  one  based  on  elliptic  curves  called 
ECPP  (for  Elliptic  Curve  Primality  Prover).  This  itself  is  based  on  an  older  primality-proving 
algorithm  based  on  finite  fields  due  to  Pocklington  and  Lehmer;  the  elliptic  curve  variant  is  due 
to Goldwasser and Kilian. The ECPP algorithm is a randomized algorithm which is not mathematically
guaranteed to always produce an output, i.e. a witness, even when the input is a prime
number. If the input is composite then the algorithm is not guaranteed to terminate at all. Although
ECPP runs in expected polynomial time, i.e. it is quite efficient, the proofs of primality it
produces can be deterministically verified even faster.

There is an algorithm due to Adleman and Huang which, unlike the ECPP method, is guaranteed
to terminate with a proof of primality on input of a prime number. It is based on a
generalization of elliptic curves called hyperelliptic curves and has never (to my knowledge) been
implemented. The fact that it has never been implemented is not only due to the far more complicated
mathematics involved, but is also due to the fact that while the hyperelliptic variant is
mathematically guaranteed to produce a proof, the ECPP method will always do so in practice for
less work effort.

2.1.6.  AKS  Algorithm:  The  Miller-Rabin  Test  is  a  randomized  primality-testing  algorithm 
which  runs  in  polynomial  time.  It  can  be  made  into  a  deterministic  polynomial-time  algorithm,  but 
only  on  the  assumption  that  the  Generalized  Riemann  Hypothesis  is  true.  The  ECPP  algorithm 
and  its  variants  are  randomized  algorithms  and  are  expected  to  have  polynomial-time  run-bounds, 
but  we  cannot  prove  they  do  so  on  all  inputs.  Thus  for  many  years  it  was  an  open  question  whether 
we could create a primality-testing algorithm which ran in deterministic polynomial time, and provably
so on all inputs without needing to assume any conjectures. In other words, the question was
whether the problem PRIMES is in the complexity class $\mathcal{P}$.

In  2002  this  was  answered  in  the  affirmative  by  Agrawal,  Kayal,  and  Saxena.  The  test  they 
developed,  now  called  the  AKS  Primality  Test,  makes  use  of  the  following  generalization  of  Fermat’s 
test.  In  the  theorem  we  are  asking  whether  two  polynomials  of  degree  n  are  the  same.  Taking  this 
basic  theorem,  which  is  relatively  easy  to  prove,  and  turning  it  into  a  polynomial-time  test  was  a 
major  breakthrough.  The  algorithm  itself  is  given  in  Algorithm  2.3.  In  the  algorithm  we  use  the 
notation $F(X) \pmod{G(X), n}$ to denote taking the reduction of $F(X)$ modulo both $G(X)$ and $n$.

Theorem 2.2. An integer $n > 2$ is prime if and only if the relation
$$(X - a)^n \equiv (X^n - a) \pmod{n}$$
holds for some integer $a$ coprime to $n$; or indeed all integers $a$ coprime to $n$.

Algorithm 2.3: AKS primality-testing algorithm

if $n = a^b$ for some integers $a$ and $b > 1$ then return “Composite”.
Find the smallest $r$ such that the order of $n$ modulo $r$ is greater than $(\log n)^2$.
if $\exists a \le r$ such that $1 < \gcd(a, n) < n$ then return “Composite”.
if $n \le r$ then return “Prime”.
for $a = 1$ to $\lfloor \sqrt{\phi(r)} \cdot \log(n) \rfloor$ do
    if $(X + a)^n \not\equiv X^n + a \pmod{X^r - 1, n}$ then return “Composite”.
return “Prime”.
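
Theorem 2.2 can be checked directly for small $n$ by expanding $(X - a)^n$ with the binomial theorem. The sketch below does exactly that; it is not the AKS algorithm, since it takes time exponential in the size of $n$, and the function name is an assumption made for this example (`math.comb` requires Python 3.8 or later).

```python
from math import comb


def satisfies_theorem_2_2(n, a):
    """Check (X - a)^n = X^n - a (mod n), coefficient by coefficient."""
    for k in range(n + 1):
        # Coefficient of X^k in (X - a)^n is C(n, k) * (-a)^(n - k).
        coeff = (comb(n, k) % n) * pow(-a, n - k, n)
        if k == n:
            expected = 1  # leading coefficient of X^n - a
        elif k == 0:
            expected = -a  # constant term of X^n - a
        else:
            expected = 0
        if (coeff - expected) % n != 0:
            return False
    return True


print([n for n in range(3, 60) if satisfies_theorem_2_2(n, 1)])
```

The difficulty that AKS overcomes is that this naive check compares $n + 1$ coefficients; working modulo $X^r - 1$ for a small $r$ reduces this to polynomially many.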


2.2. The Factoring and Factoring-Related Problems

The  most  important  one-way  function  used  in  public  key  cryptography  is  that  of  factoring  integers. 
By  factoring  an  integer  we  mean  finding  its  prime  factors,  for  example 

$$10 = 2 \cdot 5,$$
$$60 = 2^2 \cdot 3 \cdot 5,$$
$$2^{113} - 1 = 3391 \cdot 23\,279 \cdot 65\,993 \cdot 1\,868\,569 \cdot 1\,066\,818\,132\,868\,207.$$

There  are  a  number  of  other  hard  problems  related  to  factoring  which  can  be  used  to  produce 
public key cryptosystems. Suppose you are given an integer $N$, which is known to be the product
of two large primes, but not its factors $p$ and $q$. There are four main problems which we can try to
solve:

• FACTOR: Find $p$ and $q$.

• RSA: Given $e$ such that
$$\gcd(e, (p - 1)(q - 1)) = 1$$
and $c$, find $m$ such that
$$m^e \equiv c \pmod{N}.$$

• SQRROOT: Given $a$ such that
$$a \equiv x^2 \pmod{N},$$
find $x$.

• QUADRES: Given $a \in J_N$, determine whether $a$ is a square modulo $N$.


Challenger:
    $p, q \leftarrow \{v/2\text{-bit primes}\}$
    $N \leftarrow p \cdot q$
    send $N$ to the adversary $A$
    receive $p', q'$ from $A$
    Win if $p' \cdot q' = N$ and $p', q' \ne N$

Figure 2.1. Security game to define the FACTOR problem

In  Chapter  11,  we  use  so-called  security  games  to  define  security  for  cryptographic  components. 
These  are  abstract  games  played  between  an  adversary  and  a  challenger.  The  idea  is  that  the 
adversary  needs  to  achieve  some  objective  given  only  the  data  provided  by  the  challenger.  Such 
games  tend  to  be  best  described  using  pictures,  where  the  challenger  (or  environment)  is  listed 
on the outside and the adversary is presented as a box. The reason for using such diagrams will
become clearer later when we consider security proofs, but for now they are simply going to be
used to present security definitions.

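Challenger:
    $p, q \leftarrow \{v/2\text{-bit primes}\}$
    $N \leftarrow p \cdot q$
    $e, d \leftarrow \mathbb{Z}$ s.t. $e \cdot d \equiv 1 \pmod{\phi(N)}$
    $y \leftarrow (\mathbb{Z}/N\mathbb{Z})^*$
    send $N, e, y$ to the adversary $A$
    receive $x$ from $A$
    Win if $x^e \equiv y \pmod{N}$

Figure 2.2. Security game to define the RSA problem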

So  for  example,  we  could  imagine  a  game  which  defines  the  problem  of  an  adversary  A  trying 
to factor a challenge number $N$ as in Figure 2.1. The challenger comes up with two secret prime
numbers,  multiplies  them  together  and  sends  the  product  to  the  adversary.  The  adversary’s  goal 
is  to  find  the  original  prime  numbers.  Similarly  we  can  define  games  for  the  RSA  and  SQRROOT 
problems,  which  we  give  in  Figures  2.2  and  2.3. 


Challenger:
    $p, q \leftarrow \{v/2\text{-bit primes}\}$
    $N \leftarrow p \cdot q$
    $a \leftarrow Q_N$
    send $N, a$ to the adversary $A$
    receive $x$ from $A$
    Win if $x^2 \pmod{N} = a$

Figure 2.3. Security game to define the SQRROOT problem

In  all  these  games  we  define  the  advantage  of  a  specific  adversary  A  to  be  a  function  of  the 
time  t  which  the  adversary  spends  trying  to  solve  the  input  problem.  For  the  Factoring,  RSA  and 
SQRROOT  games  it  is  defined  as  the  probability  (defined  over  the  random  choices  made  by  A) 
that  the  adversary  wins  the  game  given  that  it  runs  in  time  bounded  by  t  (we  are  not  precise  on 
what  units  t  is  measured  in).  We  write 

$$\mathrm{Adv}_v^X(A, t) = \Pr[A \text{ wins the game } X \text{ for } v = \log_2 N \text{ in time less than } t].$$

If  the  adversary  is  always  successful  then  the  advantage  will  be  one,  if  the  adversary  is  never 
successful  then  the  advantage  will  be  zero. 

In the next section we will see that there is a trivial algorithm which always factors a number
in time $\sqrt{N}$. So we know that there is an adversary $A$ such that
$$\mathrm{Adv}_v^{\mathrm{FACTOR}}(A, 2^{v/2}) = 1.$$
However if $t$ is any polynomial function $p_1$ of $v = \log_2 N$ then we expect that there is no efficient
adversary $A$, and hence for such $t$ we will have
$$\mathrm{Adv}_v^{\mathrm{FACTOR}}(A, p_1(v)) < \frac{1}{p_2(v)},$$
for any polynomial $p_2(x)$ and for all adversaries $A$. A function which grows less quickly than
$1/p_2(x)$ for any polynomial function $p_2(x)$ is said to be negligible, so we say the advantage of
solving the factoring problem is negligible. Note that, even if the game was played again and again
(but only a polynomial in $v$ number of times), the adversary would still obtain a negligible probability
of winning since a negligible function multiplied by a polynomial function is still negligible.
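
The FACTOR game with a trial-division adversary can be simulated for toy parameter sizes. The sketch below is purely illustrative: the function names, the use of Python's `random` module for the challenger, and the naive prime generation are all assumptions made here, and none of it is suitable for real key sizes.

```python
import random


def is_prime(n):
    """Naive primality check by trial division (toy sizes only)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def random_prime(bits, rng):
    """Return a random prime with exactly `bits` bits."""
    while True:
        candidate = rng.randrange(2 ** (bits - 1), 2 ** bits)
        if is_prime(candidate):
            return candidate


def trial_division_adversary(N):
    """Adversary that factors N in roughly sqrt(N) = 2^(v/2) steps."""
    d = 2
    while d * d <= N:
        if N % d == 0:
            return d, N // d
        d += 1
    return 1, N  # N was prime; this answer loses the game


def factor_game(adversary, v, rng):
    """Play the FACTOR game once; return True if the adversary wins."""
    p = random_prime(v // 2, rng)
    q = random_prime(v // 2, rng)
    N = p * q
    p_guess, q_guess = adversary(N)
    return p_guess * q_guess == N and p_guess != N and q_guess != N


wins = sum(factor_game(trial_division_adversary, 16, random.Random(s)) for s in range(50))
print(wins / 50)  # empirical estimate of the advantage
```

The adversary always wins, but its running time doubles every time $v$ grows by two bits, which is why it says nothing about polynomial-time adversaries.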

In the rest of this book we will drop the time parameter from the advantage statement and
implicitly assume that all adversaries run in polynomial time; thus we simply write $\mathrm{Adv}_Y^X(A)$,
$\mathrm{Adv}_v^{\mathrm{FACTOR}}(A)$, $\mathrm{Adv}_v^{\mathrm{RSA}}(A)$ and $\mathrm{Adv}_v^{\mathrm{SQRROOT}}(A)$. We call the subscript the problem class; in the
above this is the size $v$ of the composite integers, in Chapter 3 it will be the underlying abelian
group. The superscript defines the precise game which the adversary $A$ is playing.

A  game  X  for  a  problem  class  Y  is  said  to  be  hard  if  the  advantage  is  a  negligible  function  for 
all  polynomial-time  adversaries  A.  The  problem  with  this  definition  is  that  the  notion  of  negligible 
is  asymptotic,  and  when  we  consider  cryptosystems  we  usually  talk  about  concrete  parameters;  for 
example  the  fixed  size  of  integers  which  are  to  be  factored. 

Thus, instead, we will deem a class of problems $Y$ to be hard if for all polynomial-time adversaries
$A$, the advantage $\mathrm{Adv}_Y^X(A)$ is a very small value $\epsilon$; think of $\epsilon$ as being $1/2^{128}$ or some
such  number.  This  means  that  even  if  the  run  time  of  the  adversary  was  one  time  unit,  and  we 
repeatedly  ran  the  adversary  a  large  number  of  times,  the  advantage  that  the  adversary  would  gain 
would  still  be  very  very  small.  In  this  chapter  we  leave  aside  the  issue  of  how  small  “small”  is,  but 
in  later  chapters  we  examine  this  in  more  detail. 

The  QUADRES  problem  is  a  little  different  as  we  need  to  define  the  probability  distribution 
from  which  the  challenge  numbers  a  come.  The  standard  definition  is  for  the  challenger  to  pick  a 
to  be  a  quadratic  residue  with  probability  1/2.  In  this  way  the  adversary  has  a  fifty-fifty  chance  of 
simply  guessing  whether  a  is  a  quadratic  residue  or  not.  We  present  the  game  in  Figure  2.4. 


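Challenger:
    $p, q \leftarrow \{v/2\text{-bit primes}\}$
    $N \leftarrow p \cdot q$
    If $b = 0$ then $a \leftarrow Q_N$
    If $b = 1$ then $a \leftarrow J_N \setminus Q_N$
    send $N, a$ to the adversary $A$
    receive $b'$ from $A$
    Win if $b = b'$

Figure 2.4. Security game to define the QUADRES problem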

When  defining  the  advantage  for  the  QUADRES  problem  we  need  to  be  a  bit  careful,  as  the 
adversary  can  always  win  with  probability  one  half  by  simply  just  guessing  the  bit  b  at  random. 
Instead  of  using  the  above  definition  of  advantage  (i.e.  the  probability  that  the  adversary  wins  the 
game),  we  use  the  definition 


$$\mathrm{Adv}_v^{\mathrm{QUADRES}}(A) = 2 \cdot \left| \Pr[A \text{ wins the QUADRES game for } v = \log_2 N] - \frac{1}{2} \right|.$$

Notice that, with this definition, if the adversary just guesses the bit with probability 1/2 then its
advantage is zero as we would expect, since $2 \cdot |1/2 - 1/2| = 0$. If however the adversary is always
right, or indeed always wrong, then the advantage is one, since $2 \cdot |1 - 1/2| = 2 \cdot |0 - 1/2| = 1$. Thus
the advantage is normalized to lie between zero and one, like in the earlier games, with one being
always successful and zero being no better than random.

We  call  this  type  of  game  a  decision  game  as  the  adversary  needs  to  decide  which  situation  it 
is  being  placed  in.  We  can  formulate  the  advantage  statement  for  decision  games  in  another  way, 
as the following lemma explains.

Lemma 2.3. Let $A$ be an adversary in the QUADRES game. Then, if $b'$ is the bit chosen by $A$
and $b$ is the bit chosen by the challenger in the game, we have

    $$\mathrm{Adv}_v^{\mathrm{QUADRES}}(A) = \left| \Pr[b' = 1 \mid b = 1] - \Pr[b' = 1 \mid b = 0] \right|.$$


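Proof. The proof is a straightforward application of definitions of probabilities:

    $$\begin{aligned}
    \mathrm{Adv}_v^{\mathrm{QUADRES}}(A)
      &= 2 \cdot \left| \Pr[A \text{ wins}] - \frac{1}{2} \right| \\
      &= 2 \cdot \left| \Pr[b' = 1 \text{ and } b = 1] + \Pr[b' = 0 \text{ and } b = 0] - \frac{1}{2} \right| \\
      &= 2 \cdot \left| \Pr[b' = 1 \mid b = 1] \cdot \Pr[b = 1] + \Pr[b' = 0 \mid b = 0] \cdot \Pr[b = 0] - \frac{1}{2} \right| \\
      &= 2 \cdot \left| \Pr[b' = 1 \mid b = 1] \cdot \frac{1}{2} + \Pr[b' = 0 \mid b = 0] \cdot \frac{1}{2} - \frac{1}{2} \right| \\
      &= \left| \Pr[b' = 1 \mid b = 1] + \Pr[b' = 0 \mid b = 0] - 1 \right| \\
      &= \left| \Pr[b' = 1 \mid b = 1] + \left( 1 - \Pr[b' = 1 \mid b = 0] \right) - 1 \right| \\
      &= \left| \Pr[b' = 1 \mid b = 1] - \Pr[b' = 1 \mid b = 0] \right|.
    \end{aligned}$$

□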


To see how this Lemma works consider the case when $A$ is a perfect adversary, i.e. it wins the
QUADRES game all the time. In this case we have $\Pr[A \text{ wins}] = 1$, and the advantage is equal
to $2 \cdot |1 - 1/2| = 1$ by definition. However, in this case we also have $\Pr[b' = 1 \mid b = 1] = 1$ and
$\Pr[b' = 1 \mid b = 0] = 0$. Hence, the formula from the Lemma holds. Now examine what happens
when $A$ just returns a random result. We obtain $\Pr[A \text{ wins}] = 1/2$, and the advantage is equal to
$2 \cdot |1/2 - 1/2| = 0$. The Lemma gives the same result as $\Pr[b' = 1 \mid b = 1] = \Pr[b' = 1 \mid b = 0] = 1/2$.
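
The agreement between the definition and the Lemma can also be checked mechanically. The following is a minimal sketch, assuming the challenger picks b uniformly and that an adversary is described only by the two conditional probabilities Pr[b' = 1 | b = 1] and Pr[b' = 1 | b = 0]; the function names are ours and are purely illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def advantage_by_definition(p1_given_1, p1_given_0):
    """2 * |Pr[A wins] - 1/2| when the challenger picks b uniformly."""
    # A wins when (b' = 1 and b = 1) or (b' = 0 and b = 0).
    p_win = p1_given_1 * HALF + (1 - p1_given_0) * HALF
    return 2 * abs(p_win - HALF)

def advantage_by_lemma(p1_given_1, p1_given_0):
    """|Pr[b' = 1 | b = 1] - Pr[b' = 1 | b = 0]|, the form given in the Lemma."""
    return abs(p1_given_1 - p1_given_0)

# (Pr[b'=1 | b=1], Pr[b'=1 | b=0]) for a few illustrative adversaries.
adversaries = {
    "perfect": (Fraction(1), Fraction(0)),
    "always wrong": (Fraction(0), Fraction(1)),
    "random guess": (HALF, HALF),
    "slightly biased": (Fraction(3, 4), Fraction(1, 4)),
}

for name, (p11, p10) in adversaries.items():
    print(name, advantage_by_definition(p11, p10), advantage_by_lemma(p11, p10))
```

Exact rational arithmetic is used so that the two expressions can be compared for equality rather than up to rounding.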


When  giving  these  problems  it  is  important  to  know  how  they  are  related.  We  relate  them  by 
giving  complexity-theoretic  reductions  from  one  problem  to  another.  This  allows  us  to  say  that 
“Problem  B  is  no  harder  than  Problem  A”.  Assuming  an  oracle  (or  efficient  subroutine)  to  solve 
Problem  A,  we  create  an  efficient  algorithm  for  Problem  B.  The  algorithms  which  perform  these 
reductions  should  be  efficient,  in  that  they  run  in  polynomial  time,  where  we  treat  each  oracle 
query  as  a  single  time  unit. 

We  can  also  show  equivalence  between  two  problems  A  and  B,  by  showing  an  efficient  reduction 
from  A  to  B  and  an  efficient  reduction  from  B  to  A.  If  the  two  reductions  are  both  polynomial-time 
reductions  then  we  say  that  the  two  problems  are  polynomial-time  equivalent.  The  most  important 
result  of  this  form  for  our  factoring  related  problems  is  the  following. 

Theorem  2.4.  The  FACTOR  and  SQRROOT  problems  are  polynomial-time  equivalent. 

The next two lemmas present reductions in both directions. By examining the proofs it is easy to see
that  both  of  the  reductions  can  be  performed  in  expected  polynomial  time.  Hence,  the  problems 
FACTOR  and  SQRROOT  are  polynomial-time  equivalent.  First,  in  the  next  lemma,  we  show  how 
to  reduce  SQRROOT  to  FACTOR;  if  there  is  no  algorithm  which  can  solve  SQRROOT  then  there 
is  no  algorithm  to  solve  FACTOR. 

Lemma 2.5. If $A$ is an algorithm which can factor integers of size $v$, then there is an efficient
algorithm $B$ which can solve SQRROOT for integers of size $v$. In particular

    $$\mathrm{Adv}_v^{\mathrm{FACTOR}}(A) = \mathrm{Adv}_v^{\mathrm{SQRROOT}}(B).$$

Proof. Assume we are given a factoring algorithm $A$; we wish to show how to use this to extract
square roots modulo a composite number $N$. Namely, given

    $$a = x^2 \pmod{N}$$

we wish to compute $x$. First we factor $N$ into its prime factors $p_1, p_2, \ldots, p_k$, using the factoring
oracle $A$. Then we compute

    $$s_i \leftarrow \sqrt{a} \pmod{p_i} \quad \text{for } 1 \le i \le k.$$

This can be done in expected polynomial time using Shanks' Algorithm (Algorithm 1.3 from Chapter 1).
Then we compute the value of $x$ using the Chinese Remainder Theorem on the data

    $$(s_1, p_1), \ldots, (s_k, p_k).$$

We  have  to  be  a  little  careful  if  powers  of  pi  greater  than  one  divide  N.  However,  this  is  easy  to 
deal  with  and  will  not  concern  us  here,  since  we  are  mainly  interested  in  integers  N  which  are  the 
product  of  two  primes.  Hence,  finding  square  roots  modulo  N  is  no  harder  than  factoring. 

The entire proof can be represented diagrammatically in terms of our game diagrams as in Figure
2.5, where we have specialized the game to one of integers N which are the product of two prime
factors.


    Challenger:  p, q ← {v/2-bit primes}
                 N ← p · q
                 a ← Q_N
                 sends N, a to B.
    B:           forwards N to A.
    A:           returns p, q to B.
    B:           s_p ← √a (mod p)    (by Shanks' Algorithm)
                 s_q ← √a (mod q)    (by Shanks' Algorithm)
                 x ← CRT({s_p, p}, {s_q, q})
                 returns x to the challenger.


Figure 2.5. Constructing an algorithm B to solve SQRROOT from an algorithm
A to solve FACTOR


□ 
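
The construction of B in this proof can be sketched in code. This is only an illustration under stated assumptions: the factoring oracle A is replaced by a hypothetical `toy_factor_oracle` that uses trial division (so it only works for tiny N), and the square roots modulo each prime are found with a textbook Tonelli-Shanks routine, which searches for a non-residue deterministically rather than at random.

```python
def toy_factor_oracle(N):
    """Hypothetical stand-in for the FACTOR oracle A (trial division, tiny N only)."""
    p = 2
    while p * p <= N:
        if N % p == 0:
            return p, N // p
        p += 1
    raise ValueError("N is prime")

def sqrt_mod_prime(a, p):
    """Tonelli-Shanks: return s with s*s = a (mod p) for an odd prime p."""
    a %= p
    if a == 0:
        return 0
    if pow(a, (p - 1) // 2, p) != 1:
        raise ValueError("a is not a quadratic residue modulo p")
    q, s = p - 1, 0
    while q % 2 == 0:              # write p - 1 = q * 2^s with q odd
        q //= 2
        s += 1
    z = 2
    while pow(z, (p - 1) // 2, p) != p - 1:
        z += 1                     # z is now a quadratic non-residue
    m, c = s, pow(z, q, p)
    t, r = pow(a, q, p), pow(a, (q + 1) // 2, p)
    while t != 1:
        i, t2 = 0, t
        while t2 != 1:             # least i with t^(2^i) = 1
            t2 = t2 * t2 % p
            i += 1
        b = pow(c, 1 << (m - i - 1), p)
        m, c = i, b * b % p
        t, r = t * c % p, r * b % p
    return r

def solve_sqrroot(N, a, factor_oracle):
    """Algorithm B: one square root of a modulo N = p*q, using a FACTOR oracle."""
    p, q = factor_oracle(N)
    s_p, s_q = sqrt_mod_prime(a, p), sqrt_mod_prime(a, q)
    # Chinese Remainder Theorem on (s_p, p) and (s_q, q); needs Python 3.8+.
    return (s_p * q * pow(q, -1, p) + s_q * p * pow(p, -1, q)) % N

N = 101 * 113
a = pow(1234, 2, N)
x = solve_sqrroot(N, a, toy_factor_oracle)
print(x, x * x % N == a)
```

Note that B returns some square root of a, which need not be the value that was originally squared.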

We  now  show  how  to  reduce  FACTOR  to  SQRROOT;  if  there  is  no  algorithm  which  can  solve 
FACTOR  then  there  is  no  algorithm  to  solve  SQRROOT. 

Lemma  2.6.  Let  A  be  an  algorithm  which  can  solve  SQRROOT  for  integers  of  size  v;  then  there 
is  an  efficient  algorithm  B  which  can  factor  integers  of  size  v.  In  particular  for  N  a  product  of  two 
primes  we  have 

    $$\mathrm{Adv}_v^{\mathrm{SQRROOT}}(A) = 2 \cdot \mathrm{Adv}_v^{\mathrm{FACTOR}}(B).$$

The  proof  of  this  result  contains  an  important  tool  used  in  the  factoring  algorithms  of  the  next 
section,  namely  the  construction  of  a  difference  of  two  squares. 

Proof.  Assume  we  are  given  an  algorithm  A  for  extracting  square  roots  modulo  a  composite 
number  N.  We  shall  assume  for  simplicity  that  N  is  a  product  of  two  primes,  which  is  the  most 
difficult case. The general case is only slightly more tricky mathematically, but it is computationally
easier since factoring numbers with three or more prime factors is usually easier than factoring
numbers with two prime factors.

We wish to use our algorithm $A$ for the problem SQRROOT to factor the integer $N$ into its
prime factors, i.e. given $N = p \cdot q$ we wish to compute $p$. First we pick a random $x \in (\mathbb{Z}/N\mathbb{Z})^*$ and
compute

    $$a \leftarrow x^2 \pmod{N}.$$


Now we compute

    $$y \leftarrow \sqrt{a} \pmod{N}$$

using the SQRROOT algorithm. There are four such square roots, since $N$ is a product of two
primes. With fifty percent probability we obtain

    $$y \ne \pm x \pmod{N}.$$

If we do not obtain this inequality then we abort.

We now assume that the inequality holds, but we note that we have the equality $x^2 = y^2 \pmod{N}$.
It is then easy to see that $N$ divides

    $$x^2 - y^2 = (x - y)(x + y).$$

But $N$ does not divide either $x - y$ or $x + y$, since $y \ne \pm x \pmod{N}$. So the factors of $N$ must be
distributed over $x - y$ and $x + y$. This means we can obtain a non-trivial factor of $N$ by computing
$\gcd(x - y, N)$.

It is because of the above fifty percent probability that we get a factor of two in our advantage
statement, since $B$ is only successful if $A$ is successful and we obtain $y \ne \pm x \pmod{N}$. Thus
$\Pr[B \text{ wins}] = \Pr[A \text{ wins}]/2$. Diagrammatically we represent this reduction in Figure 2.6.


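    Challenger:  sends N to B.
    B:           picks x ∈ (Z/NZ)*, computes a ← x² (mod N), sends N, a to A.
    A:           returns y.
    B:           aborts if y = ±x (mod N); otherwise returns p ← gcd(x − y, N).


Figure 2.6. Constructing an algorithm B to solve FACTOR from an algorithm A
to solve SQRROOT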


□ 
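
As an illustration of this reduction, the sketch below builds a hypothetical SQRROOT oracle and uses it to factor a small N. The assumptions are ours: the oracle is simulated using knowledge of p and q, both chosen congruent to 3 modulo 4 so that square roots modulo each prime are a single exponentiation, and, unlike the single attempt analysed in the lemma, the code simply retries after an abort.

```python
import math
import random

def make_sqrt_oracle(p, q):
    """Hypothetical SQRROOT oracle A for N = p*q, with p = q = 3 (mod 4)."""
    N = p * q

    def oracle(a):
        s_p = pow(a, (p + 1) // 4, p)      # a square root of a modulo p
        s_q = pow(a, (q + 1) // 4, q)      # a square root of a modulo q
        # Return one of the four roots at random; the oracle never sees x.
        s_p = random.choice((s_p, p - s_p))
        s_q = random.choice((s_q, q - s_q))
        return (s_p * q * pow(q, -1, p) + s_q * p * pow(p, -1, q)) % N

    return oracle

def factor_with_sqrt_oracle(N, oracle, max_tries=64):
    """Algorithm B: find a non-trivial factor of N using a SQRROOT oracle."""
    for _ in range(max_tries):
        x = random.randrange(2, N)
        d = math.gcd(x, N)
        if d != 1:
            return d                       # x already shares a factor with N
        y = oracle(x * x % N)
        if y in (x, N - x):
            continue                       # y = +-x (mod N): this attempt aborts
        return math.gcd(x - y, N)          # difference of two squares
    return None

p, q = 103, 107
print(factor_with_sqrt_oracle(p * q, make_sqrt_oracle(p, q)))
```

Each attempt succeeds with probability one half, so the chance that all 64 attempts abort is negligible.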

Before leaving the problem SQRROOT, note that QUADRES is no harder than SQRROOT, since
an algorithm to compute square roots modulo N can trivially be used to determine quadratic
residuosity.

Finally we end this section by showing that the RSA problem can be reduced to FACTOR.
Recall the RSA problem is given $c = m^e \pmod{N}$, find $m$. There is some evidence, although slight,
that the RSA problem may actually be easier than FACTOR for some problem instances. It is a
major open question as to how much easier it is.

Lemma 2.7. The RSA problem is no harder than the FACTOR problem. In particular, if $A$ is
an algorithm which can solve FACTOR for integers of size $v$, then there is an efficient algorithm
$B$ which can solve the RSA problem for integers of size $v$. In particular for $N$ a product of two
primes we have

    $$\mathrm{Adv}_v^{\mathrm{FACTOR}}(A) = \mathrm{Adv}_v^{\mathrm{RSA}}(B).$$

Proof. Using the factoring algorithm $A$ we first find the factorization of $N$. We can now compute
$\Phi = \phi(N)$ and then compute

    $$d \leftarrow 1/e \pmod{\Phi}.$$

Once $d$ has been computed it is easy to recover $m$ via

    $$c^d = m^{e \cdot d} = m^{1 \pmod{\Phi}} = m \pmod{N},$$

with the last equality following by Lagrange's Theorem, Theorem 1.4. Hence, the RSA problem is
no harder than FACTOR. We leave it to the reader to present a diagram of this reduction similar
to the ones above. □
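
The proof translates directly into a few lines of code. In this sketch the FACTOR oracle is again a hypothetical trial-division stand-in, and the modulus, exponent and message are toy values chosen by us for illustration only.

```python
def toy_factor_oracle(N):
    """Hypothetical stand-in for the FACTOR oracle A (trial division, tiny N only)."""
    p = 2
    while p * p <= N:
        if N % p == 0:
            return p, N // p
        p += 1
    raise ValueError("N is prime")

def solve_rsa(N, e, c, factor_oracle):
    """Algorithm B: recover m from c = m^e (mod N) using a FACTOR oracle."""
    p, q = factor_oracle(N)
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)        # d = 1/e (mod phi); needs Python 3.8+
    return pow(c, d, N)        # c^d = m (mod N)

N, e, m = 61 * 53, 17, 65      # toy parameters, far too small for real use
c = pow(m, e, N)
print(solve_rsa(N, e, c, toy_factor_oracle) == m)
```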


2.3.  Basic  Factoring  Algorithms 

Finding factors is an expensive computational operation. To measure the complexity of algorithms
to factor an integer $N$ we often use the function

    $$L_N(\alpha, \beta) = \exp\left( (\beta + o(1)) (\log N)^{\alpha} (\log\log N)^{1-\alpha} \right).$$

Note that

• $L_N(0, \beta) = (\log N)^{\beta + o(1)}$, i.e. essentially polynomial time,

• $L_N(1, \beta) = N^{\beta + o(1)}$, i.e. essentially exponential time.

So in some sense, the function $L_N(\alpha, \beta)$ interpolates between polynomial and exponential time. An
algorithm with complexity $O(L_N(\alpha, \beta))$ for $0 < \alpha < 1$ is said to have sub-exponential behaviour.
Note that multiplication, which is the inverse algorithm to factoring, is a very simple operation
requiring time less than $O(L_N(0, 2))$.
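
To get a feel for how this function interpolates, one can evaluate it numerically. The sketch below is a rough aid only: it sets the o(1) term to zero, which the definition does not justify for any particular N, and it reports the base-2 logarithm of the result so that the numbers stay manageable. The function name is ours.

```python
import math

def log2_L(bits, alpha, beta):
    """log2 of L_N(alpha, beta) for N = 2^bits, with the o(1) term set to 0."""
    ln_n = bits * math.log(2)                      # natural log of N
    exponent = beta * ln_n ** alpha * math.log(ln_n) ** (1 - alpha)
    return exponent / math.log(2)

for alpha, beta in [(0, 2), (1 / 3, 1.923), (1 / 2, 1), (1, 1)]:
    print(alpha, beta, round(log2_L(1024, alpha, beta), 1))
```

For alpha = 1 and beta = 1 the function returns the bit length of N itself, matching the exponential end of the scale.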

There  are  a  number  of  methods  to  factor  numbers  of  the  form 

N  =  p  •  q. 

For  now  we  just  summarize  the  most  well-known  techniques. 

• Trial Division: Try every prime number up to $\sqrt{N}$ and see whether it is a factor of $N$.
This has complexity $L_N(1, 1/2)$, and is therefore an exponential algorithm.

• Elliptic Curve Method: This is a very good method if $p < 2^{50}$; its complexity is
$L_p(1/2, c)$, for some constant $c$, which is a sub-exponential function. Note that the complexity
is given in terms of the size of the smallest unknown prime factor $p$. If the number
is a product of two primes of very unequal size then the elliptic curve method may be the
best at finding the factors.

• Quadratic Sieve: This is probably the fastest method for factoring integers that have
between 80 and 100 decimal digits. It has complexity $L_N(1/2, 1)$.

• Number Field Sieve: This is currently the most successful method for numbers with
more than 100 decimal digits. It has factored numbers of size $10^{155} \approx 2^{512}$ and has
complexity $L_N(1/3, 1.923)$.

Factoring methods are usually divided into Dark Age methods such as

• Trial division,

• $p - 1$ method,

• $p + 1$ method,

• Pollard rho method,

and modern methods such as

• Continued Fraction Method (CFRAC),

• Quadratic Sieve (QS),

• Elliptic Curve Method (ECM),

• Number Field Sieve (NFS).

We  do  not  have  space  to  discuss  all  of  these  in  detail  so  we  shall  look  at  a  couple  of  Dark  Age 
methods  and  explain  the  main  ideas  behind  some  of  the  modern  methods. 

2.3.1. Trial Division: The most elementary algorithm is trial division, which we have already
met in the context of testing primality. Suppose $N$ is the number we wish to factor; we proceed as
described in Algorithm 2.4. A moment's thought reveals that trial division takes time at worst

    $$O(\sqrt{N}) = O\left( 2^{(\log_2 N)/2} \right).$$

The input size to the algorithm is of size $\log_2 N$, hence this complexity is exponential. But just
as in primality testing, we should not ignore trial division. It is usually the method of choice for
numbers less than $10^{12}$.


Algorithm 2.4: Factoring via trial division

    for p = 2 to √N do
        e ← 0.
        if (N mod p) = 0 then
            while (N mod p) = 0 do
                e ← e + 1.
                N ← N/p.
            output (p, e).
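
A runnable version of Algorithm 2.4 might look as follows. This is a sketch rather than a transcription: it lets the loop bound shrink as factors are removed and, as an addition of ours, reports whatever cofactor greater than one remains at the end, since that cofactor must be prime.

```python
def trial_division(n):
    """Return the factorization of n as a list of (prime, exponent) pairs."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                e += 1
                n //= p
            factors.append((p, e))
        p += 1
    if n > 1:
        factors.append((n, 1))     # the remaining cofactor is a single prime
    return factors

print(trial_division(360))
print(trial_division(1159))
```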


2.3.2.  Smooth  Numbers:  For  larger  numbers  we  would  like  to  improve  on  the  trial  division 
algorithm.  Almost  all  other  factoring  algorithms  make  use  of  other  auxiliary  numbers  called  smooth 
numbers.  Essentially  a  smooth  number  is  one  which  is  easy  to  factor  using  trial  division;  the 
following  definition  makes  this  more  precise. 

Definition  2.8  (Smooth  Number).  Let  B  be  an  integer.  An  integer  N  is  called  B -smooth  if  every 
prime  factor  p  of  N  is  less  than  B. 


For example

    $$N = \ldots \cdot 11^{3}$$

is 12-smooth. Sometimes we say that the number is just smooth if the bound $B$ is small compared
with $N$. The number of $y$-smooth numbers which are less than $x$ is given by the function $\psi(x, y)$.
This is a rather complicated function which is approximated by

    $$\psi(x, y) \approx x \rho(u)$$

where $\rho$ is the Dickman-de Bruijn function and

    $$u = \frac{\log x}{\log y}.$$

The Dickman-de Bruijn function $\rho$ is defined as the function which satisfies the following
differential-delay equation

    $$u \cdot \rho'(u) + \rho(u - 1) = 0,$$

for $u > 1$. In practice we approximate $\rho(u)$ via the expression

    $$\rho(u) \approx u^{-u},$$

which holds as $u \to \infty$. This leads to the following result, which is important in analysing advanced
factoring algorithms.

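Theorem 2.9. The proportion of integers less than $x$ which are $x^{1/u}$-smooth is asymptotically
equal to $u^{-u}$.

Now if we set $y = L_N(\alpha, \beta)$ then

    $$u = \frac{\log N}{\log y} = \frac{1}{\beta} \left( \frac{\log N}{\log\log N} \right)^{1-\alpha}.$$

Hence, we can show

    $$\frac{1}{N} \psi(N, y) \approx u^{-u} = \exp(-u \cdot \log u) \approx \frac{1}{L_N(1 - \alpha, \gamma)},$$

for some constant $\gamma$.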

Suppose we are looking for numbers less than $N$ which are $L_N(\alpha, \beta)$-smooth. The probability
that any number less than $N$ is actually $L_N(\alpha, \beta)$-smooth is, as we have seen, given by
$1/L_N(1 - \alpha, \gamma)$. This explains intuitively why some of the modern method complexity estimates for factoring
are around $L_N(0.5, c)$, since to balance the smoothness bound against the probability estimate we
take $\alpha = 1/2$. The Number Field Sieve only obtains a better complexity estimate by using a more
mathematically complex algorithm.

We shall also require, in discussing our next factoring algorithm, the notion of a number being
$B$-power smooth:

Definition 2.10 (Power Smooth). A number $N$ is said to be $B$-power smooth if every prime power
dividing $N$ is less than $B$.

For example $N = 2^5 \cdot 3^3$ is 33-power smooth.
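
The two definitions are easy to test on small numbers. The following sketch checks B-smoothness and B-power smoothness by trial division; the helper names are ours, and the strict inequalities follow Definitions 2.8 and 2.10 as stated.

```python
def prime_power_factors(n):
    """Return [(p, e), ...] with n equal to the product of the p**e."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                e += 1
                n //= p
            factors.append((p, e))
        p += 1
    if n > 1:
        factors.append((n, 1))
    return factors

def is_smooth(n, B):
    """True if every prime factor of n is less than B."""
    return all(p < B for p, _ in prime_power_factors(n))

def is_power_smooth(n, B):
    """True if every maximal prime power dividing n is less than B."""
    return all(p ** e < B for p, e in prime_power_factors(n))

N = 2**5 * 3**3
print(is_smooth(N, 4), is_power_smooth(N, 33), is_power_smooth(N, 32))
```

The number $2^5 \cdot 3^3$ is 4-smooth but only becomes power smooth once the bound exceeds $2^5 = 32$.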

2.3.3. Pollard's $P - 1$ Method: The most famous name in factoring algorithms in the late
twentieth century was John Pollard. Almost all the important advances in factoring were made by
him, for example

• The $P - 1$ method,

• The Rho-method,

• The Number Field Sieve.

In this section we discuss the $P - 1$ method and in a later section we consider the Number Field
Sieve method.

Suppose the number we wish to factor is given by $N = p \cdot q$. In addition suppose we know (by
some pure guess) an integer $B$ such that $p - 1$ is $B$-power smooth, but that $q - 1$ is not $B$-power
smooth. We can then hope that $p - 1$ divides $B!$, but $q - 1$ is unlikely to divide $B!$.

Suppose that we compute

    $$a \leftarrow 2^{B!} \pmod{N}.$$

Imagine that we could compute this modulo $p$ and modulo $q$; we would then have

    $$a = 1 \pmod{p},$$

since

• $p - 1$ divides $B!$,

• $a^{p-1} = 1 \pmod{p}$ by Fermat's Little Theorem.

But it is unlikely that we would have $a = 1 \pmod{q}$. Hence,

• $p$ will divide $a - 1$,

• $q$ will not divide $a - 1$.

We can then recover $p$ by computing $p = \gcd(a - 1, N)$, as in Algorithm 2.5.


Algorithm 2.5: Pollard's P − 1 factoring method

    a ← 2.
    for j = 2 to B do
        a ← a^j mod N.
    p ← gcd(a − 1, N).
    if p ≠ 1 and p ≠ N then return "p is a factor of N".
    else return "No Result".

As an example, suppose we wish to factor $N = 15\,770\,708\,441$. We take $B = 180$ and running
the above algorithm we obtain

    $$a = 2^{B!} \pmod{N} = 1\,162\,022\,425.$$

Then we obtain

    $$p = \gcd(a - 1, N) = 135\,979.$$

To see why this works in this example we see that the prime factorization of $N$ is given by

    $$N = 135\,979 \cdot 115\,979$$

and we have

    $$p - 1 = 135\,978 = 2 \cdot 3 \cdot 131 \cdot 173,$$
    $$q - 1 = 115\,978 = 2 \cdot 103 \cdot 563.$$

Hence $p - 1$ is indeed $B$-power smooth, whilst $q - 1$ is not $B$-power smooth.
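
Algorithm 2.5 is short enough to run directly on this example. The sketch below follows the pseudocode step by step; the function name and the choice of returning `None` for "No Result" are ours.

```python
import math

def pollard_p_minus_1(N, B):
    """Pollard's P - 1 method with base 2; returns a factor of N or None."""
    a = 2
    for j in range(2, B + 1):
        a = pow(a, j, N)           # after the loop, a = 2^(B!) mod N
    p = math.gcd(a - 1, N)
    if p != 1 and p != N:
        return p
    return None

N = 15770708441
p = pollard_p_minus_1(N, 180)
print(p, N // p)
```

With the bound B = 180 from the text, the gcd picks out the factor whose predecessor is B-power smooth.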

One can show that the complexity of the $P - 1$ method is given by

    $$O\left( B \cdot \log B \cdot (\log N)^2 + (\log N)^3 \right).$$

So if we choose $B = O((\log N)^i)$, for some integer $i$, then this is a polynomial-time factoring
algorithm, but it only works for numbers of a special form.

Due to the $P - 1$ method we often see it recommended that RSA primes are chosen to satisfy

    $$p - 1 = 2 \cdot p_1 \quad \text{and} \quad q - 1 = 2 \cdot q_1,$$

where $p_1$ and $q_1$ are both primes. In this situation the primes $p$ and $q$ are called safe primes. For
a random 1024-bit prime $p$ the probability that $p - 1$ is $B$-power smooth, for a small value of $B$,
is very small. Hence, choosing random 1024-bit primes would in all likelihood render the $P - 1$
method useless, and so choosing $p$ to be a safe prime is not really needed.

2.3.4. Difference of Two Squares: A basic trick in factoring algorithms, known for many centuries,
is to produce two numbers $x$ and $y$, of around the same size as $N$, such that

    $$x^2 = y^2 \pmod{N},$$

since then we have

    $$x^2 - y^2 = (x - y) \cdot (x + y) = 0 \pmod{N}.$$

If $N = p \cdot q$ then we have four possible cases

(1) $p$ divides $x - y$ and $q$ divides $x + y$.

(2) $p$ divides $x + y$ and $q$ divides $x - y$.

(3) $p$ and $q$ both divide $x - y$ but neither divides $x + y$.

(4) $p$ and $q$ both divide $x + y$ but neither divides $x - y$.

All these cases can occur with equal probability, namely 1/4. If we then compute

    $$d = \gcd(x - y, N),$$

our previous four cases then divide into the cases

(1) $d = p$.

(2) $d = q$.

(3) $d = N$.

(4) $d = 1$.

Since all these cases occur with equal probability, we see that with probability 1/2 we will obtain a
non-trivial factor of $N$. The only problem is, how do we find $x$ and $y$ such that $x^2 = y^2 \pmod{N}$?
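
The four cases can be seen concretely on a toy modulus. In the sketch below, which uses the illustrative values N = 77 and x = 2 chosen by us, we find every y with the same square as x by brute force and list the resulting gcds.

```python
import math

def gcd_outcomes(N, x):
    """For every y with y^2 = x^2 (mod N), return gcd(x - y, N), sorted."""
    target = x * x % N
    return sorted(math.gcd(x - y, N) for y in range(N) if y * y % N == target)

print(gcd_outcomes(77, 2))
```

Each of the four square roots of $x^2$ gives a different value of d, and exactly two of them are non-trivial factors.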


2.4.  Modern  Factoring  Algorithms 

Most  modern  factoring  methods  use  the  following  strategy  based  on  the  difference-of-two-squares 
method  described  at  the  end  of  the  last  section. 

• Take a smoothness bound $B$.

• Compute a factorbase $F$ of all prime numbers $p$ less than $B$.

• Find a large number of values of $x$ and $y$ such that $x$ and $y$ are $B$-smooth and

    $$x = y \pmod{N}.$$

These are called relations on the factorbase.

• Using linear algebra modulo 2, find a combination of the relations to give an $X$ and $Y$
with

    $$X^2 = Y^2 \pmod{N}.$$

• Attempt to factor $N$ by computing $\gcd(X - Y, N)$.

The  trick  in  all  algorithms  of  this  form  is  how  to  find  the  relations.  All  the  other  details  of  the 
algorithms  are  basically  the  same.  Such  a  strategy  can  be  used  to  solve  discrete  logarithm  problems 
as  well,  which  we  shall  discuss  in  Chapter  3.  In  this  section,  we  explain  the  parts  of  the  modern 
factoring  algorithms  which  are  common  and  justify  why  they  work. 

One way of looking at such algorithms is in the context of computational group theory. The
factorbase is essentially a set of generators of the group $(\mathbb{Z}/N\mathbb{Z})^*$, whilst the relations are relations
between the generators of this group. Once a sufficiently large number of relations have been
found, since the group is a finite abelian group, standard group-theoretic algorithms will compute
the group structure and hence the group order. From the group order $\phi(N) = (p - 1)(q - 1)$, we
are able to factor the integer $N$. These general group-theoretic algorithms could include computing
the Smith Normal Form of the associated matrix. Hence, it should not be surprising that linear
algebra is used on the relations to factor the integer $N$.

Combining Relations: The Smith Normal Form algorithm is far too complicated for factoring
algorithms where a more elementary approach can be used, still based on linear algebra, as we shall
now explain. Suppose we have the relations

    $$p^2 \cdot q^5 \cdot r^2 = p^3 \cdot q^4 \cdot r^3 \pmod{N},$$
    $$p \cdot q^3 \cdot r^5 = p \cdot q \cdot r^2 \pmod{N},$$
    $$p^3 \cdot q^5 \cdot r^3 = p \cdot q^3 \cdot r^2 \pmod{N},$$

where $p$, $q$ and $r$ are primes in our factorbase, $F = \{p, q, r\}$. Dividing one side by the other in each
of our relations we obtain

    $$p^{-1} \cdot q \cdot r^{-1} = 1 \pmod{N},$$
    $$q^2 \cdot r^3 = 1 \pmod{N},$$
    $$p^2 \cdot q^2 \cdot r = 1 \pmod{N}.$$

Multiplying the last two equations together we obtain

    $$p^{0+2} \cdot q^{2+2} \cdot r^{3+1} = 1 \pmod{N}.$$

In other words

    $$p^2 \cdot q^4 \cdot r^4 = 1 \pmod{N}.$$

Hence if $X = p \cdot q^2 \cdot r^2$ and $Y = 1$ then we obtain

    $$X^2 = Y^2 \pmod{N}$$

as required and computing

    $$\gcd(X - Y, N)$$

will give us a fifty percent chance of factoring $N$.

Whilst  it  was  easy  to  see  by  inspection  in  the  previous  example  how  to  combine  the  relations 
to  obtain  a  square,  in  a  real-life  example  our  factorbase  could  consist  of  hundreds  of  thousands  of 
primes  and  we  would  have  hundreds  of  thousands  of  relations.  We  basically  need  a  technique  to 
automate  this  process  of  finding  out  how  to  combine  relations  into  squares.  This  is  where  linear 
algebra  can  come  to  our  aid. 

We  explain  how  to  automate  the  process  using  linear  algebra  by  referring  to  our  previous  simple 
example.  Recall  that  our  relations  were  equivalent  to 

    $$p^{-1} \cdot q \cdot r^{-1} = 1 \pmod{N},$$
    $$q^2 \cdot r^3 = 1 \pmod{N},$$
    $$p^2 \cdot q^2 \cdot r = 1 \pmod{N}.$$


To find which equations to multiply together to obtain a square, we take a matrix $A$ with $\#F$
columns (one for each element of the factorbase) and number of rows equal to the number of relations.
Each relation is coded into the matrix as a row, modulo two, which in our example becomes

    $$A = \begin{pmatrix} -1 & 1 & -1 \\ 0 & 2 & 3 \\ 2 & 2 & 1 \end{pmatrix}
        = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix} \pmod{2}.$$


We now try to find a (non-zero) binary vector $z$ such that

    $$z \cdot A = 0 \pmod{2}.$$

In our example we can take

    $$z = (0, 1, 1)$$

since

    $$\begin{pmatrix} 0 & 1 & 1 \end{pmatrix}
      \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}
      = \begin{pmatrix} 0 & 0 & 0 \end{pmatrix} \pmod{2}.$$

This solution vector $z = (0, 1, 1)$ tells us that multiplying the last two equations together will
produce a square modulo $N$.
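
For a matrix this small the vector z can be found by hand, but the search is easy to automate. The sketch below is one simple way to do it, not the specialized method used in practice: each row is reduced modulo 2 and packed into an integer bitmask, and ordinary elimination over Z/2Z is run while recording which original rows were combined. The function name is ours.

```python
def find_dependency(rows):
    """Return a non-zero 0/1 list z with z*A = 0 (mod 2), or None if none exists."""
    n = len(rows)
    # Reduce every exponent modulo 2 and pack each row into a bitmask.
    vecs = [sum((e % 2) << j for j, e in enumerate(r)) for r in rows]
    pivots = {}                        # lowest set bit -> (reduced row, combination)
    for i in range(n):
        v, combo = vecs[i], 1 << i     # combo records which rows were added together
        while v:
            low = v & -v               # lowest set bit of v
            if low not in pivots:
                pivots[low] = (v, combo)
                break
            pv, pc = pivots[low]
            v ^= pv                    # addition modulo 2 is XOR
            combo ^= pc
        if v == 0:
            return [(combo >> k) & 1 for k in range(n)]
    return None

# Exponent vectors (p, q, r) of the three relations in the example.
A = [(-1, 1, -1), (0, 2, 3), (2, 2, 1)]
print(find_dependency(A))
```

On the example matrix this recovers the vector z = (0, 1, 1) found by inspection above.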




Finding the vector $z$ is done using a variant of Gaussian Elimination. Hence in general this
means that we require more equations (i.e. relations) than elements in the factorbase. This
relation-combining stage of factoring algorithms is usually the hardest part since the matrices involved tend
to be rather large. For example using the Number Field Sieve to factor a 100-decimal-digit number
may require a matrix of dimension over 100 000. This results in huge memory problems and requires
the writing of specialist matrix code and often the use of specialized supercomputers.

The matrix will have around 500 000 rows and as many columns, for cryptographically
interesting numbers. As this is nothing but a matrix modulo 2 each entry could be represented by a
single bit. If we used a dense matrix representation then the matrix alone would occupy around 29
gigabytes of storage. Luckily the matrix is very, very sparse and so the storage will not be so large.
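
The storage figure follows from simple arithmetic, which the short sketch below reproduces. We assume here that "gigabytes" is meant in the binary sense of $2^{30}$ bytes, which is the reading that matches the quoted value.

```python
rows = cols = 500_000
dense_bits = rows * cols               # one bit per matrix entry
dense_bytes = dense_bits / 8
dense_gib = dense_bytes / 2**30        # gigabytes in the binary sense
print(round(dense_gib, 1))
```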

As  we  said  above,  we  can  compute  the  vector  z  such  that  z  •  A  =  0  using  a  variant  of  Gaussian 
Elimination  over  Z/2Z.  But  standard  Gaussian  Elimination  would  start  with  a  sparse  matrix  and 
end  up  with  an  upper  triangular  dense  matrix,  so  we  would  be  back  with  the  huge  memory  problem 
again.  To  overcome  this  problem  very  advanced  matrix  algorithms  are  deployed  that  try  not  to 
alter  the  matrix  at  all.  We  do  not  discuss  these  here  but  refer  the  interested  reader  to  the  book  of 
Lenstra  and  Lenstra  mentioned  in  the  Further  Reading  section  of  this  chapter.  The  only  thing  we 
have  not  sketched  is  how  to  find  the  relations,  a  topic  which  we  shall  discuss  in  the  next  section. 

2.5.  Number  Field  Sieve 

The Number Field Sieve is the fastest known factoring algorithm. The basic idea is to factor a
number $N$ by finding two integers $x$ and $y$ such that

    $$x^2 = y^2 \pmod{N};$$

we then expect (hope) that $\gcd(x - y, N)$ will give us a non-trivial factor of $N$. To explain the basic
method we shall start with the linear sieve and then show how this is generalized to the Number
Field Sieve. The linear sieve is not a very good algorithm but it does show the rough method.

2.5.1.  The  Linear  Sieve:  We  let  F  denote  a  set  of  “small”  prime  numbers  which  form  the 
factorbase: 

F  =  {p  :  p  <  B}. 

A number which factorizes with all its factors in $F$ is therefore $B$-smooth. The idea of the linear
sieve is to find many pairs of integers $a$ and $\lambda$ such that

    $$b = a + N \cdot \lambda$$

is $B$-smooth. If in addition we only select values of $a$ which are "small", then we would expect that
$a$ will also be $B$-smooth and we could write

    $$a = \prod_{p \in F} p^{a_p}$$

and

    $$b = a + N \cdot \lambda = \prod_{p \in F} p^{b_p}.$$

We would then have a relation in $\mathbb{Z}/N\mathbb{Z}$

    $$\prod_{p \in F} p^{a_p} = \prod_{p \in F} p^{b_p} \pmod{N}.$$

So the main question is how do we find such values of $a$ and $\lambda$?

(1) Fix a value of $\lambda$ to consider.

(2) Initialize an array of length $A + 1$ indexed by 0 to $A$ with zeros, for some value of $A$.

(3) For each prime $p \in F$ add $\log_2 p$ to every array location whose position is congruent to
$-\lambda \cdot N \pmod{p}$.

(4) Choose the candidates for $a$ to be the positions of those elements that exceed some threshold
bound.

The reasoning behind this method is that a position of the array that has an entry exceeding some
bound will have a good chance of being $B$-smooth, when added to $\lambda N$, as it is likely to be divisible
by many primes in $F$. This is yet another application of the Sieve of Eratosthenes.


Linear Sieve Example: For example suppose we take $N = 1159$, $F = \{2, 3, 5, 7, 11\}$ and $\lambda = -2$.
So we wish to find a smooth value of

    $$a - 2N.$$

We initialize the sieving array as follows:

    Index   0     1     2     3     4     5     6     7     8     9
    Value   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0

We now take the first prime in $F$, namely $p = 2$, and we compute $-\lambda \cdot N \pmod{p} = 0$. So we add
$\log_2(2) = 1$ to every array location with index equal to 0 modulo 2. This results in our sieve array
becoming:

    Index   0     1     2     3     4     5     6     7     8     9
    Value   1.0   0.0   1.0   0.0   1.0   0.0   1.0   0.0   1.0   0.0

We now take the next prime in $F$, namely $p = 3$, and compute $-\lambda \cdot N \pmod{p} = 2$. So we add
$\log_2(3) = 1.6$ to every array location with index equal to 2 modulo 3. Our sieve array then becomes:

    Index   0     1     2     3     4     5     6     7     8     9
    Value   1.0   0.0   2.6   0.0   1.0   1.6   1.0   0.0   2.6   0.0

Continuing in this way with $p = 5$, 7 and 11, eventually the sieve array becomes:

    Index   0     1     2     3     4     5     6     7     8      9
    Value   1.0   2.8   2.6   2.3   1.0   1.6   1.0   0.0   11.2   0.0

Hence, the value $a = 8$ looks like it should correspond to a smooth value, and indeed it does, since
we find

    $$a + \lambda \cdot N = 8 - 2 \cdot 1159 = -2310 = -2 \cdot 3 \cdot 5 \cdot 7 \cdot 11.$$
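
The sieving steps of this example can be reproduced with a few lines of code. This is a minimal sketch of the four steps listed earlier for a single value of the multiplier, written `lam` in the code; the function name and the use of an array of length ten are ours.

```python
import math

def linear_sieve(N, lam, factorbase, length):
    """Sieve array for the candidates a = 0, ..., length - 1 of a + lam*N."""
    sieve = [0.0] * length
    for p in factorbase:
        start = (-lam * N) % p         # p divides a + lam*N iff a = -lam*N (mod p)
        for a in range(start, length, p):
            sieve[a] += math.log2(p)
    return sieve

N, lam = 1159, -2
sieve = linear_sieve(N, lam, [2, 3, 5, 7, 11], 10)
best = max(range(len(sieve)), key=lambda a: sieve[a])
print([round(v, 1) for v in sieve])
print(best, best + lam * N)
```

A real sieve would keep every position above a threshold rather than only the maximum; the single best position is enough to recover the relation found in the text.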


So using the linear sieve we obtain a large collection of numbers, $a_i$ and $b_i$, such that

    $$a_i = \prod_{p_j \in F} p_j^{a_{i,j}} = \prod_{p_j \in F} p_j^{b_{i,j}} = b_i \pmod{N}.$$

We assume that we have at least $B + 1$ such relations with which we then form a matrix with the
$i$th row being

    $$(a_{i,1}, \ldots, a_{i,t}, b_{i,1}, \ldots, b_{i,t}) \pmod{2}.$$

We then find elements of the kernel of this matrix modulo 2. This will tell us how to multiply the
$a_i$ and the $b_i$ together to obtain elements $x^2$ and $y^2$ such that $x, y \in \mathbb{Z}$ are easily calculated and

    $$x^2 = y^2 \pmod{N}.$$

We can then try to factor $N$, but if these values of $x$ and $y$ do not provide a factor we just find a
new element in the kernel of the matrix and continue.

The  basic  linear  sieve  gives  a  very  small  yield  of  relations.  There  is  a  variant  called  the  large 
prime  variation  which  relaxes  the  sieving  condition  to  allow  through  pairs  a  and  b  which  are  almost 
$B$-smooth, bar say a single "large" prime in $a$ and a single "large" prime in $b$. These large primes
then have to be combined in some way so that the linear algebra step can proceed as above. This is
done by constructing a graph and using an algorithm which computes a basis for the set of cycles
in the graph. The basic idea for the large prime variation originally arose in the context of the
quadratic sieve algorithm, but it can be applied to any of the sieving algorithms used in factoring.

It is clear that the sieving could be carried out in parallel, hence the sieving can be parcelled out
to  lots  of  slave  computers  around  the  world.  The  slaves  then  communicate  any  relations  they  find 
to  the  central  master  computer  which  performs  the  linear  algebra  step.  In  such  a  way  the  Internet 
can  be  turned  into  a  large  parallel  computer  dedicated  to  factoring  numbers.  As  we  have  already 
remarked,  the  final  (linear  algebra)  step  often  needs  to  be  performed  on  specialized  equipment  with 
large  amounts  of  disk  space  and  RAM,  so  this  final  computation  cannot  be  distributed  over  the 
Internet. 


2.5.2. Higher-Degree Sieving: The linear sieve is simply not good enough to factor large numbers.
Indeed, the linear sieve was never proposed as a real factoring algorithm, but its operation
is instructive for other algorithms of this type. The Number Field Sieve (NFS) uses the arithmetic
of algebraic number fields to construct the desired relations between the elements of the factorbase.
All that changes is the way the relations are found. The linear algebra step, the large prime
variations  and  the  slave/master  approach  all  go  over  to  NFS  virtually  unchanged.  We  now  explain 
the  NFS,  but  in  a  much  simpler  form  than  is  actually  used  in  real  life  so  as  to  aid  the  exposition. 
Those  readers  who  do  not  know  any  algebraic  number  theory  may  wish  to  skip  this  section. 

First we construct two monic, irreducible polynomials with integer coefficients $f_1$ and $f_2$, of
degree $d_1$ and $d_2$ respectively, such that there exists an $m \in \mathbb{Z}$ such that

$$f_1(m) = f_2(m) = 0 \pmod{N}.$$

The Number Field Sieve will make use of arithmetic in the number fields $K_1$ and $K_2$ given by

$$K_1 = \mathbb{Q}(\theta_1) \quad \text{and} \quad K_2 = \mathbb{Q}(\theta_2),$$

where $\theta_1$ and $\theta_2$ are defined by $f_1(\theta_1) = f_2(\theta_2) = 0$. We then have two homomorphisms $\phi_1$ and $\phi_2$
given by

$$\phi_i : \begin{cases} \mathbb{Z}[\theta_i] \longrightarrow \mathbb{Z}/N\mathbb{Z} \\ \theta_i \longmapsto m. \end{cases}$$

We aim to use a sieve, just as in the linear sieve, to find a set

$$S \subset \{(a, b) \in \mathbb{Z}^2 : \gcd(a, b) = 1\}$$

such that

$$\prod_{S} (a - b \cdot \theta_1) = \beta^2$$

and

$$\prod_{S} (a - b \cdot \theta_2) = \gamma^2,$$

where $\beta \in K_1$ and $\gamma \in K_2$. If we found two such values of $\beta$ and $\gamma$ then we would have

$$\phi_1(\beta)^2 = \phi_2(\gamma)^2 \pmod{N}$$

and we hope

$$\gcd(N, \phi_1(\beta) - \phi_2(\gamma))$$

would be a factor of $N$.

This leads to three obvious problems, which we address in the following three sub-sections:

• How do we find the set $S$?

• Given $\beta^2 \in K_1$ how do we compute $\beta$?

• How do we find the polynomials $f_1$ and $f_2$ in the first place?

How do we find the set $S$?: Similar to the linear sieve we can find such a set $S$ using linear
algebra provided we can find lots of $a$ and $b$ such that

$$a - b \cdot \theta_1 \quad \text{and} \quad a - b \cdot \theta_2$$

are both “smooth”. But what does it mean for these two objects to be smooth? This is rather
complicated, and for the rest of this section we will assume the reader has a basic acquaintance
with algebraic number theory. It is here that the theory of algebraic number fields comes in: by
generalizing our earlier definition of smooth integers to algebraic integers we obtain the following
definition:

Definition 2.11. An algebraic integer is “smooth” if and only if the ideal it generates is only
divisible by “small” prime ideals.

Define $F_i(X, Y) = Y^{d_i} \cdot f_i(X/Y)$, then

$$N_{\mathbb{Q}(\theta_i)/\mathbb{Q}}(a - b \cdot \theta_i) = F_i(a, b).$$

We define two factorbases, one for each of the polynomials

$$\mathcal{F}_i = \{(p, r) : p \text{ a prime}, \ r \in \mathbb{Z} \text{ such that } f_i(r) = 0 \pmod{p}\}.$$

Each element of $\mathcal{F}_i$ corresponds to a degree-one prime ideal of $\mathbb{Z}[\theta_i]$, which is a sub-order of the
ring of integers $\mathcal{O}_{\mathbb{Q}(\theta_i)}$, given by

$$\langle p, \theta_i - r \rangle := p\mathbb{Z}[\theta_i] + (\theta_i - r)\mathbb{Z}[\theta_i].$$

Given values of $a$ and $b$ we can easily determine whether the ideal $\langle a - \theta_i \cdot b \rangle$ “factorizes” over our
factorbase. Note factorizes is in quotes as unique factorization of ideals may not hold in $\mathbb{Z}[\theta_i]$,
whilst it will hold in $\mathcal{O}_{\mathbb{Q}(\theta_i)}$. It will turn out that this is not really a problem. To see why this is
not a problem you should consult the book by Lenstra and Lenstra.

If $\mathbb{Z}[\theta_i] = \mathcal{O}_{\mathbb{Q}(\theta_i)}$ then the following method does indeed give the unique prime ideal factorization
of $\langle a - \theta_i \cdot b \rangle$.

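• Write

$$F_i(a, b) = \prod_{(p_j, r) \in \mathcal{F}_i} p_j^{s_j^{(i)}}.$$

• We have $(a : b) = (r : 1) \pmod{p}$, as an element in the projective space of dimension
one over $\mathbb{F}_p$ (i.e. $a/b = r \pmod{p}$), if the ideal corresponding to $(p, r)$ is included in a
non-trivial way in the ideal factorization of $a - \theta_i \cdot b$.

• We have

$$\langle a - \theta_i \cdot b \rangle = \prod_{(p_j, r) \in \mathcal{F}_i} \langle p_j, \theta_i - r \rangle^{s_j^{(i)}}.$$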

This leads to the following algorithm to sieve for values of $a$ and $b$, such that $\langle a - \theta_i \cdot b \rangle$ is an ideal
which factorizes over the factorbase. Just as with the linear sieve, the use of sieving allows us to
avoid lots of expensive trial divisions when trying to determine smooth ideals. We end up only
performing factorizations where we already know we have a good chance of being successful.

• Fix $a$.

• Initialize the sieve array for $-B \le b \le B$ by

$$S[b] = \log_2(F_1(a, b) \cdot F_2(a, b)).$$

• For every $(p, r) \in \mathcal{F}_i$ subtract $\log_2 p$ from every array element $S[b]$ where $b$ is such that

$$a - r \cdot b = 0 \pmod{p}.$$

• The values of $b$ we want are the ones such that $S[b]$ lies below some tolerance level.
If the tolerance level is set in a sensible way then we have a good chance that both $F_1(a, b)$ and
$F_2(a, b)$ factor over the prime ideals in the factorbase, with the possibility of some large prime ideals
creeping in. We keep these factorizations as a relation, just as we did with the linear sieve.
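To make the sieving steps concrete, here is a deliberately naive sketch for the toy polynomials $f_1(x) = x^2 + 1$ and $f_2(x) = x - 290$ used in the example later in this section. The function names, the use of absolute values before taking logarithms, and the skipping of $b = 0$ are our own assumptions for illustration. The sketch also does not enforce $\gcd(a, b) = 1$, and it tests every $b$ in turn, whereas a real sieve would step through the array with stride $p$.

```python
from math import log2

def f1(x):
    return x * x + 1           # toy polynomial f_1

def f2(x):
    return x - 290             # toy polynomial f_2

def F1(a, b):
    return a * a + b * b       # b^2 * f_1(a/b)

def F2(a, b):
    return a - 290 * b         # b * f_2(a/b)

def factorbase(f, primes):
    """All pairs (p, r) with f(r) = 0 (mod p)."""
    return [(p, r) for p in primes for r in range(p) if f(r) % p == 0]

def sieve_line(a, B, primes):
    """Return {b: residual logarithm} for fixed a and -B <= b <= B."""
    S = {}
    for b in range(-B, B + 1):
        value = F1(a, b) * F2(a, b)
        if b != 0 and value != 0:
            S[b] = log2(abs(value))   # abs() is an assumption of this sketch
    for f in (f1, f2):
        for p, r in factorbase(f, primes):
            for b in S:
                # Naive test of every b; a real sieve strides by p instead.
                if (a - r * b) % p == 0:
                    S[b] -= log2(p)
    return S


S = sieve_line(-38, 20, [2, 3, 5, 7, 11, 13, 17])
print(min(S, key=S.get))
```

Because each prime is subtracted only once, repeated prime factors leave a small residue in $S[b]$; this is one reason the tolerance level cannot simply be zero.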

Then, after some linear algebra, we can find a subset $S$ of all the pairs $(a, b)$ we have found
such that

$$\prod_{(a,b) \in S} \langle a - b \cdot \theta_i \rangle = \text{square of an ideal in } \mathbb{Z}[\theta_i].$$

However, this is not good enough. Recall that we want the product $\prod (a - b \cdot \theta_i)$ to be the square
of an element of $\mathbb{Z}[\theta_i]$. To overcome this problem we need to add information from the “infinite”
places. This is done by adding in some quadratic characters, an idea introduced by Adleman. Let
$q$ be a rational prime (in neither $\mathcal{F}_1$ nor $\mathcal{F}_2$) such that there is an $s_q$ with $f_i(s_q) = 0 \pmod{q}$ and
$f_i'(s_q) \ne 0 \pmod{q}$ for either $i = 1$ or $i = 2$. Then our extra condition is that we require

$$\prod_{(a,b) \in S} \left( \frac{a - b \cdot s_q}{q} \right) = 1,$$

where $\left( \frac{\cdot}{q} \right)$ denotes the Legendre symbol. As the Legendre symbol is multiplicative this gives us an
extra condition to put into our matrix. We need to add this condition for a number of primes $q$,
hence we choose a set of such primes $q$ and put the associated characters into our matrix as an
extra column of 0s or 1s corresponding to:

$$\text{if } \left( \frac{a - b \cdot s_q}{q} \right) = \begin{cases} 1 & \text{then enter } 0, \\ -1 & \text{then enter } 1. \end{cases}$$

After finding enough relations we hope to be able to find a subset $S$ such that

$$\prod_{S} (a - b \cdot \theta_1) = \beta^2 \quad \text{and} \quad \prod_{S} (a - b \cdot \theta_2) = \gamma^2.$$

How do we take the square roots?: We then need to be able to take the square root of $\beta^2$ to
recover $\beta$, and similarly for $\gamma^2$. Each $\beta^2$ is given in the form

$$\beta^2 = \sum_{j=0}^{d_1 - 1} a_j \cdot \theta_1^j$$

where the $a_j$ are huge integers. We want to be able to determine the solutions $b_j \in \mathbb{Z}$ to the
equation

$$\left( \sum_{j=0}^{d_1 - 1} b_j \cdot \theta_1^j \right)^2 = \sum_{j=0}^{d_1 - 1} a_j \cdot \theta_1^j.$$

One  way  this  is  overcome,  due  to  Couveignes,  is  by  computing  such  a  square  root  modulo  a  large 
number  of  very,  very  large  primes  p.  We  then  perform  Hensel  lifting  and  Chinese  remaindering 
to  hopefully  recover  our  square  root.  This  is  the  easiest  method  to  understand  although  more 
advanced  methods  are  available. 


Choosing the initial polynomials: This is the part of the method that is a black art at the
moment. We require only the following conditions to be met

$$f_1(m) = f_2(m) = 0 \pmod{N}.$$

However there are good heuristic reasons why it also might be desirable to construct polynomials
with additional properties such as

• The polynomials have small coefficients.

• $f_1$ and $f_2$ have “many” real roots. Note, a random polynomial probably would have no
real roots on average.

• $f_1$ and $f_2$ have “many” roots modulo lots of small prime numbers.

• The Galois groups of $f_1$ and $f_2$ are “small”.

It  is  often  worth  spending  a  few  weeks  trying  to  find  a  good  couple  of  polynomials  before  we  start  to 
attempt  the  factorization  algorithm  proper.  There  are  a  number  of  search  strategies  used  for  finding 
these  polynomials.  Once  a  few  candidates  are  found,  some  experimental  sieving  is  performed  to 
see  which  appear  to  be  the  most  successful,  in  that  they  yield  the  most  relations.  Then,  once  a 
decision  has  been  made  we  can  launch  the  sieving  stage  “for  real” . 


Example: I am grateful to Richard Pinch for allowing me to include the following example. It
is taken from his lecture notes from a course at Cambridge in the mid-1990s. Suppose we wish
to factor the number $N = 290^2 + 1 = 84\,101$. We take $f_1(x) = x^2 + 1$ and $f_2(x) = x - 290$ with
$m = 290$. Then

$$f_1(m) = f_2(m) = 0 \pmod{N}.$$

On one side we have the order $\mathbb{Z}[i]$ which is the ring of integers of $\mathbb{Q}(i)$ and on the other side we
have the order $\mathbb{Z}$. We obtain the following factorizations:


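$$\begin{array}{rr|rl|rl}
x & y & N(x - i \cdot y) & \text{Factors} & x - m \cdot y & \text{Factors} \\ \hline
-38 & -1 & 1445 & 5 \cdot 17^2 & 252 & 2^2 \cdot 3^2 \cdot 7 \\
-22 & -19 & 845 & 5 \cdot 13^2 & 5488 & 2^4 \cdot 7^3
\end{array}$$

We then obtain the two factorizations, which are real factorizations of elements, as $\mathbb{Z}[i]$ is a unique
factorization domain.

$$-38 + i = -(2 + i) \cdot (4 - i)^2 \quad \text{and} \quad -22 + 19 \cdot i = -(2 + i) \cdot (3 - 2 \cdot i)^2.$$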


Hence, after a trivial bit of linear algebra, we obtain the following “squares”

$$(-38 + i) \cdot (-22 + 19 \cdot i) = (2 + i)^2 \cdot (3 - 2 \cdot i)^2 \cdot (4 - i)^2 = (31 - 12 \cdot i)^2$$

and

$$(-38 + m) \cdot (-22 + 19 \cdot m) = 2^6 \cdot 3^2 \cdot 7^4 = 1176^2.$$

We then apply the map $\phi_1$ to $31 - 12 \cdot i$ to obtain

$$\phi_1(31 - 12 \cdot i) = 31 - 12 \cdot m = -3449.$$

But then we have

$$\begin{aligned}
(-3449)^2 &= \phi_1(31 - 12 \cdot i)^2 \\
&= \phi_1((31 - 12 \cdot i)^2) \\
&= \phi_1((-38 + i) \cdot (-22 + 19 \cdot i)) \\
&= \phi_1(-38 + i) \cdot \phi_1(-22 + 19 \cdot i) \\
&= (-38 + m) \cdot (-22 + 19 \cdot m) \pmod{N} \\
&= 1176^2.
\end{aligned}$$

So we compute

$$\gcd(N, -3449 + 1176) = 2273$$

and

$$\gcd(N, -3449 - 1176) = 37.$$

Hence 37 and 2273 are factors of $N = 84\,101$.
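The arithmetic in this example is small enough to check by machine. The following sketch represents a Gaussian integer $u + v \cdot i$ as the pair `(u, v)`; the helper names `gmul`, `phi1` and `toy_nfs_check` are ours and exist only for this check.

```python
from math import gcd

N, m = 84101, 290

def gmul(u, v):
    """Multiply Gaussian integers given as (real, imaginary) pairs."""
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def phi1(u):
    """The map phi_1 from Z[i] to Z/NZ which sends i to m."""
    return (u[0] + u[1] * m) % N

def toy_nfs_check():
    beta = (31, -12)
    # Algebraic side: the product of the two relations is beta^2.
    assert gmul((-38, 1), (-22, 19)) == gmul(beta, beta)
    # Rational side: the product of the two relations is 1176^2.
    assert (-38 + m) * (-22 + 19 * m) == 1176 ** 2
    x, y = phi1(beta), 1176
    assert (x * x - y * y) % N == 0
    return sorted({gcd(N, x - y), gcd(N, x + y)})


print(toy_nfs_check())
```

Note that `phi1` returns the representative of $-3449$ in the range $[0, N)$; the two greatest common divisors are unaffected by this choice.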
Chapter Summary

• Prime numbers are very common and the probability that a random $n$-bit number is prime
is around $1/n$.

•  Numbers  can  be  tested  for  primality  using  a  probable  prime  test  such  as  the  Fermat 
or  Miller-Rabin  algorithms.  The  Fermat  Test  has  a  problem  in  that  certain  composite 
numbers  will  always  pass  the  Fermat  Test,  no  matter  how  we  choose  the  possible  witnesses. 

•  If  we  really  need  to  be  certain  that  a  number  is  prime  then  there  are  primality-proving 
algorithms  which  run  in  polynomial  time. 

•  We  introduced  the  problems  FACTOR,  SQRROOT  and  RSA,  and  the  relations  between 
them. 

•  Factoring  algorithms  are  often  based  on  the  problem  of  finding  the  difference  of  two 
squares. 

•  Modern  factoring  algorithms  run  in  two  stages:  In  the  first  stage  we  collect  many  relations 
on  a  factorbase  by  using  a  process  called  sieving,  which  can  be  done  using  thousands  of 
computers  on  the  Internet.  In  the  second  stage  these  relations  are  processed  using  linear 
algebra  on  a  big  central  server.  The  final  factorization  is  obtained  by  finding  a  difference 
of  two  squares. 


Further  Reading 

The  definitive  reference  work  on  computational  number  theory  which  deals  with  many  algorithms 
for  factoring  and  primality  proving  is  the  book  by  Cohen.  The  book  by  Bach  and  Shallit  also 
provides  a  good  reference  for  primality  testing.  The  main  book  explaining  the  Number  Field  Sieve 
is  the  book  by  Lenstra  and  Lenstra. 

E.  Bach  and  J.  Shallit.  Algorithmic  Number  Theory.  Volume  1:  Efficient  Algorithms.  MIT  Press, 
1996. 

H.  Cohen.  A  Course  in  Computational  Algebraic  Number  Theory.  Springer,  1993. 

A.  Lenstra  and  H.  Lenstra.  The  Development  of  the  Number  Field  Sieve.  Springer,  1993. 


CHAPTER  3 


Discrete  Logarithms 


Chapter  Goals 

•  To  examine  algorithms  for  solving  the  discrete  logarithm  problem. 

•  To  introduce  the  Pohlig-Hellman  algorithm. 

•  To  introduce  the  Baby-Step/Giant-Step  algorithm. 

•  To  explain  the  methods  of  Pollard. 

•  To  show  how  discrete  logarithms  can  be  solved  in  finite  fields  using  algorithms  like  those 
used  for  factoring. 

•  To  describe  the  known  results  on  the  elliptic  curve  discrete  logarithm  problem. 

3.1.  The  DLP,  DHP  and  DDH  Problems 

In  Chapter  2  we  examined  the  hard  problem  of  FACTOR.  This  gave  us  some  (hopefully)  one-way 
functions,  namely  the  RSA  function,  the  squaring  function  modulo  a  composite  and  the  function 
which  multiplies  two  large  numbers  together.  Another  important  class  of  problems  are  those  based 
on the discrete logarithm problem or its variants. Let $(G, \cdot)$ be a finite abelian group of prime order
$q$, such as a subgroup of the multiplicative group of a finite field or the set of points on an elliptic
curve over a finite field (see Chapter 4). The discrete logarithm problem, or DLP, in $G$ is: given
$g, h \in G$, find an integer $x \in [0, \ldots, q)$ (if it exists) such that

$$g^x = h.$$

We write $x = \mathrm{dlog}_g(h)$. A diagram for the security game for the discrete logarithm problem is given
in Figure 3.1, for a group $G$ of prime order $q$. Just as for our factoring-based games we define an
advantage function as the probability that the adversary wins the game in Figure 3.1 for a group
$G$, i.e. $\mathrm{Adv}^{\mathrm{DLP}}_G(A) = \Pr[A \text{ wins the DLP game in the group } G]$.


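[Diagram: $g \leftarrow G$; $x \leftarrow \mathbb{Z}/q\mathbb{Z}$; $h \leftarrow g^x$ is passed to the adversary $A$, which returns $x'$. Win if $x' = x$.]

Figure 3.1. Security game to define the discrete logarithm problem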

For some groups $G$ this problem is easy. For example if we take $G$ to be the integers modulo a
number $N$ under addition, then given $g, h \in \mathbb{Z}/N\mathbb{Z}$ we need to solve

$$x \cdot g = h.$$
We have already seen in Chapter 1 that we can easily tell whether such an equation has a solution,
and determine its solution when it does, using the extended Euclidean algorithm.
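As a minimal sketch of why the additive case is easy, the following function solves $x \cdot g = h$ in $\mathbb{Z}/N\mathbb{Z}$. The name `additive_dlog` is ours, and we rely on Python's built-in `pow(u, -1, n)` (available from Python 3.8) to compute the modular inverse, which stands in for an explicit extended Euclidean computation.

```python
from math import gcd

def additive_dlog(g, h, N):
    """Solve x * g = h (mod N); return one solution, or None if none exists."""
    d = gcd(g, N)
    if h % d != 0:
        return None                   # solvable only when gcd(g, N) divides h
    n = N // d
    # After dividing through by d, g/d is invertible modulo N/d.
    return (h // d) * pow(g // d, -1, n) % n


print(additive_dlog(3, 7, 10))
```

When $\gcd(g, N) = d > 1$ and a solution exists, it is determined only modulo $N/d$; the sketch returns the smallest non-negative one.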

For certain other groups determining discrete logarithms is believed to be hard. For example
in the multiplicative group of a finite field the best known algorithm for this task is the Number
Field Sieve/Function Field Sieve. The complexity of determining discrete logarithms in this case is
given by

$$L_N(1/3, c)$$

for some constant $c$, depending on the type of the finite field, e.g. whether it is a large prime field
or an extension field of small characteristic.

For other groups, such as elliptic curve groups, the discrete logarithm problem is believed to be
even harder. The best known algorithm for finding discrete logarithms on a general elliptic curve
defined over a finite field $\mathbb{F}_q$ is Pollard’s Rho method, a fully exponential algorithm with complexity

$$\sqrt{q} = L_q(1, 1/2).$$

Since determining elliptic curve discrete logarithms is harder than in the case of multiplicative
groups of finite fields we are able to use smaller groups. This leads to an advantage in key size.
Elliptic curve cryptosystems often have much smaller key sizes (say 256 bits) compared with those
based on factoring or discrete logarithms in finite fields (where for both the “equivalent” recommended
key size is about 2048 bits).

Later in this chapter we survey the methods known for solving the discrete logarithm problem,

$$h = g^x$$

in various groups $G$. These algorithms fall into one of two categories: either the algorithms are
generic and apply to any finite abelian group or the algorithms are specific to the special group
under consideration.

Just as with the FACTOR problem, where we had a number of related problems, with discrete
logarithms there are also related problems that we need to discuss. Suppose we are given a finite
abelian group $(G, \cdot)$, of prime order $q$, and $g \in G$. The first of these is the Diffie-Hellman problem.

Definition 3.1 (Computational Diffie-Hellman Problem (DHP)). Given $g \in G$, $a = g^x$ and $b = g^y$,
for unknowns $x$ and $y$ chosen at random from $\mathbb{Z}/q\mathbb{Z}$, find $c$ such that $c = g^{x \cdot y}$.


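[Diagram: $g \leftarrow G$; $x, y \leftarrow \mathbb{Z}/q\mathbb{Z}$; $a \leftarrow g^x$ and $b \leftarrow g^y$ are passed to the adversary $A$, which returns $c$. Win if $c = g^{x \cdot y}$.]

Figure 3.2. Security game to define the Computational Diffie-Hellman problem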

Diagrammatically we can represent the associated security game as in Figure 3.2, and we define
the advantage of the adversary $A$ in the game by

$$\mathrm{Adv}^{\mathrm{DHP}}_G(A) = \Pr[A \text{ wins the DHP game in the group } G].$$

We first show how to reduce solving the Diffie-Hellman problem to the discrete logarithm problem.
But before doing so we note that in some groups there is a more complicated argument to show
that the DHP is in fact equivalent to the DLP. This is done by producing a reduction in the other
direction for the specific groups in question.


Lemma 3.2. In an arbitrary finite abelian group $G$ the DHP is no harder than the DLP. In
particular for all algorithms $A$ there is an algorithm $B$ such that

$$\mathrm{Adv}^{\mathrm{DLP}}_G(A) = \mathrm{Adv}^{\mathrm{DHP}}_G(B).$$

Proof. Suppose we have an oracle/algorithm $A$ which will solve the DLP, i.e. on input of $h = g^x$
it will return $x$. To solve the DHP on input of $a = g^x$ and $b = g^y$ we compute

(1) $z \leftarrow A(a)$.

(2) $c \leftarrow b^z$.

(3) Output $c$.

The above reduction clearly runs in polynomial time and will compute the true solution to the
DHP, assuming algorithm $A$ returns the correct value, i.e. $z = x$. Hence, the DHP is no harder
than the DLP. □
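The three steps of the reduction can be written out directly. In the sketch below the oracle $A$ is played by a brute-force search, which is only feasible because we work in a tiny group (the subgroup of order 101 in $\mathbb{F}_{607}^*$ generated by 64, borrowed from an example later in this chapter); the function names are our own.

```python
def dlp_oracle(g, h, q, modulus):
    """Stand-in for the oracle A: brute-force x in [0, q) with g^x = h."""
    acc = 1
    for x in range(q):
        if acc == h:
            return x
        acc = acc * g % modulus
    raise ValueError("h is not a power of g")

def solve_dhp(g, a, b, q, modulus):
    z = dlp_oracle(g, a, q, modulus)   # (1) z <- A(a)
    c = pow(b, z, modulus)             # (2) c <- b^z
    return c                           # (3) output c


p, q, g = 607, 101, 64
a, b = pow(g, 17, p), pow(g, 45, p)
print(solve_dhp(g, a, b, q, p) == pow(g, 17 * 45, p))
```

The reduction itself makes one oracle call and one exponentiation; all of the real cost is hidden inside the oracle.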


There  is  a  decisional  version  of  the  DHP  problem,  just  like  there  is  a  decisional  version  QUADRES 
of  the  SQRROOT  problem. 

Definition 3.3 (Decision Diffie-Hellman problem (DDH)). The adversary is given $g \in G$, $a = g^x$,
$b = g^y$, and $c = g^z$, for unknowns $x$, $y$ and $z$. The value $z$ is chosen by the challenger to be equal
to $x \cdot y$ with probability 1/2, otherwise it is chosen at random. The goal of the adversary is to
determine which case he thinks the challenger picked, i.e. he has to determine whether $z = x \cdot y$.


Diagrammatically we can represent the associated security game as in Figure 3.3. But when defining
the advantage for the DDH problem we need to be a bit careful, as the adversary can always win
with probability one half, by just guessing the bit $b$ at random. This is exactly the same situation
as we had when looking at the QUADRES problem in Chapter 2. We hence define the advantage
by

$$\mathrm{Adv}^{\mathrm{DDH}}_G(A) = 2 \cdot \left| \Pr[A \text{ wins the DDH game in } G] - \frac{1}{2} \right|.$$

Notice that with this definition, just as with the advantage in the QUADRES game, if the adversary
just guesses the bit with probability 1/2 then its advantage is zero as we would expect, since
$2 \cdot |1/2 - 1/2| = 0$. If however the adversary is always right, or indeed always wrong, then the
advantage is one, since $2 \cdot |1 - 1/2| = 2 \cdot |0 - 1/2| = 1$. Thus, the advantage is normalized to lie
between zero and one, with one being always successful and zero being no better than random.
Just like Lemma 2.3 we have that another way to write down the advantage is

$$\mathrm{Adv}^{\mathrm{DDH}}_G(A) = \left| \Pr[b' = 1 \mid b = 1] - \Pr[b' = 1 \mid b = 0] \right|.$$

Figure 3.3. Security game to define the Decision Diffie-Hellman problem


We now show how to reduce the solution of the Decision Diffie-Hellman problem to the Computational
Diffie-Hellman problem, and hence using our previous argument to the discrete logarithm
problem.


Lemma 3.4. In an arbitrary finite abelian group $G$ the DDH is no harder than the DHP. In
particular for all algorithms $A$ there is an algorithm $B$ such that

$$\mathrm{Adv}^{\mathrm{DHP}}_G(A) = \mathrm{Adv}^{\mathrm{DDH}}_G(B).$$

Proof. Suppose we have an oracle $A$ which on input of $g^x$ and $g^y$ computes the value of $g^{x \cdot y}$. To
solve the DDH on input of $a = g^x$, $b = g^y$ and $c = g^z$ we compute

(1) $d \leftarrow A(a, b)$.

(2) If $d = c$ output one.

(3) Else output zero.

Again the reduction clearly runs in polynomial time, and assuming the output of the oracle is
correct then the above reduction will solve the DDH. □
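This reduction is equally short in code. As a hedged illustration, the DHP oracle $A$ below is simulated by brute force in the subgroup of order 101 in $\mathbb{F}_{607}^*$ generated by 64; the function names and the choice of group are ours.

```python
def dhp_oracle(g, a, b, q, modulus):
    """Stand-in for the oracle A: returns g^(x*y) given a = g^x and b = g^y."""
    acc = 1
    for x in range(q):
        if acc == a:
            return pow(b, x, modulus)
        acc = acc * g % modulus
    raise ValueError("a is not a power of g")

def solve_ddh(g, a, b, c, q, modulus):
    d = dhp_oracle(g, a, b, q, modulus)   # (1) d <- A(a, b)
    return 1 if d == c else 0             # (2), (3) compare with c


p, q, g = 607, 101, 64
a, b = pow(g, 5, p), pow(g, 7, p)
print(solve_ddh(g, a, b, pow(g, 35, p), q, p))
print(solve_ddh(g, a, b, pow(g, 36, p), q, p))
```

The first call is given a genuine Diffie-Hellman triple and the second is not, so the two outputs differ.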


So  the  Decision  Diffie-Hellman  problem  is  no  harder  than  the  Computational  Diffie-Hellman 
problem. There are, however, some groups[1] in which we can solve the DDH in polynomial time
but  the  fastest  known  algorithm  to  solve  the  DHP  takes  sub-exponential  time.  Hence,  of  our  three 
discrete-logarithm-based  problems,  the  easiest  is  DDH,  then  comes  DHP  and  finally  the  hardest 
problem  is  DLP. 


3.2. Pohlig-Hellman


The first observation to make is that the discrete logarithm problem in a group $G$ is only as hard
as the discrete logarithm problem in the largest subgroup of prime order in $G$. This observation
is due to Pohlig and Hellman, and it applies in an arbitrary finite abelian group. To explain the
Pohlig-Hellman algorithm, suppose we have a finite cyclic abelian group $G = \langle g \rangle$ whose order is
given by

$$N = \#G = \prod_{i=1}^{t} p_i^{e_i}.$$

Now suppose we are given $h \in \langle g \rangle$, so there exists an integer $x$ such that

$$h = g^x.$$

Our aim is to find $x$ by first finding it modulo $p_i^{e_i}$ and then using the Chinese Remainder Theorem
to recover it modulo $N$.

From basic group theory we know that there is a group isomorphism

$$\phi : G \longrightarrow C_{p_1^{e_1}} \times \cdots \times C_{p_t^{e_t}},$$

where $C_{p^e}$ is a cyclic group of prime power order $p^e$. The projection of $\phi$ on the component $C_{p^e}$ is
given by

$$\phi_p : \begin{cases} G \longrightarrow C_{p^e} \\ f \longmapsto f^{N/p^e}. \end{cases}$$

The map $\phi_p$ is a group homomorphism, so if we have $h = g^x$ in $G$ then we will have $\phi_p(h) = \phi_p(g)^x$
in $C_{p^e}$. But the discrete logarithm in $C_{p^e}$ is only determined modulo $p^e$. So if we could solve the
discrete logarithm problem in $C_{p^e}$, then we would determine $x$ modulo $p^e$. Doing this for all primes
$p$ dividing $N$ would allow us to solve for $x$ using the Chinese Remainder Theorem. In summary
suppose we have some oracle $O(g, h, p, e)$ which for $g, h \in C_{p^e}$ will output the discrete logarithm of
$h$ with respect to $g$. We can then solve for $x$ using Algorithm 3.1.


[1] For example supersingular elliptic curves.


Algorithm 3.1: Algorithm to solve the DLP in a group of order $N$, given an oracle for DLP
for prime power divisors of $N$

$S \leftarrow \{\}$.
for all primes $p$ dividing $N$ do
    Compute the largest $e$ such that $T = p^e$ divides $N$.
    $g_1 \leftarrow g^{N/T}$.
    $h_1 \leftarrow h^{N/T}$.
    $z \leftarrow O(g_1, h_1, p, e)$.
    $S \leftarrow S \cup \{(z, T)\}$.
$x \leftarrow \mathrm{CRT}(S)$.


The only problem is that we have not shown how to solve the discrete logarithm problem in
$C_{p^e}$. We shall now show how this is done, by reducing to solving $e$ discrete logarithm problems in
the group $C_p$. Suppose $g, h \in C_{p^e}$ and there is an $x$ such that

$$h = g^x.$$

Clearly $x$ is only defined modulo $p^e$ and we can write

$$x = x_0 + x_1 \cdot p + \cdots + x_{e-1} \cdot p^{e-1}.$$

We find $x_0, x_1, \ldots$ in turn, using the following inductive procedure. Suppose we know $x'$, the value
of $x$ modulo $p^t$, i.e.

$$x' = x_0 + \cdots + x_{t-1} \cdot p^{t-1}.$$

We now wish to determine $x_t$ and so compute $x$ modulo $p^{t+1}$. We write

$$x = x' + p^t \cdot y,$$

so we have that

$$h = g^{x'} \cdot (g^{p^t})^y.$$

Hence, if we set

$$h_1 = h \cdot g^{-x'} \quad \text{and} \quad g_1 = g^{p^t},$$

then

$$h_1 = g_1^y.$$

Now $g_1$ is an element of order $p^{e-t}$, so to obtain an element of order $p$, and hence a discrete logarithm
problem in $C_p$, we need to raise the above equation to the power $s = p^{e-t-1}$. So, setting

$$h_2 = h_1^s \quad \text{and} \quad g_2 = g_1^s,$$

we obtain the discrete logarithm problem in $C_p$ given by

$$h_2 = g_2^{x_t}.$$

So assuming we can solve discrete logarithms in $C_p$ we can find $x_t$ and so find $x$.


Pohlig-Hellman Example: We now illustrate this approach, assuming we can find discrete logarithms
in cyclic groups of prime order. We leave to the next two sections how to do this; for
now we assume that it is possible. As an example of the Pohlig-Hellman algorithm, consider the
multiplicative group of the finite field $\mathbb{F}_{397}$. This group has order

$$N = 396 = 2^2 \cdot 3^2 \cdot 11$$

and a generator of $\mathbb{F}_{397}^*$ is given by

$$g = 5.$$



We wish to solve the discrete logarithm problem given by

$$h = 208 = 5^x \pmod{397}.$$

We first reduce to the three subgroups of prime power order, by raising the above equation to the
power $396/p^e$, for each maximal prime power $p^e$ which divides the order of the group 396. Hence,
we obtain the three discrete logarithm problems

$$\begin{aligned}
334 &= h^{396/4} = 5^{(396/4) \cdot x_4} = 334^{x_4} \pmod{397}, \\
286 &= h^{396/9} = g^{(396/9) \cdot x_9} = 79^{x_9} \pmod{397}, \\
273 &= h^{396/11} = g^{(396/11) \cdot x_{11}} = 290^{x_{11}} \pmod{397}.
\end{aligned}$$

The value of $x_4$ is the value of $x$ modulo 4, the value of $x_9$ is the value of $x$ modulo 9 whilst the
value of $x_{11}$ is the value of $x$ modulo 11. Clearly if we can determine these three values then we
can determine $x$ modulo 396.

Determining $x_4$: By inspection we see that $x_4 = 1$, but let us labour the point and show how the
above algorithm will determine this for us. We write

$$x_4 = x_{4,0} + 2 \cdot x_{4,1},$$

where $x_{4,0}, x_{4,1} \in \{0, 1\}$. Recall that we wish to solve

$$h_1 = 334 = 334^{x_4} = g_1^{x_4}.$$

We set $h_2 = h_1^2$ and $g_2 = g_1^2$ and solve the discrete logarithm problem

$$h_2 = g_2^{x_{4,0}}$$

in the cyclic group of order two. We find, using our oracle for the discrete logarithm problem in
cyclic groups, that $x_{4,0} = 1$. So we now have

$$\frac{h_1}{g_1} = g_2^{x_{4,1}} \pmod{397}.$$

Hence we have $1 = 396^{x_{4,1}}$, which is another discrete logarithm in the cyclic group of order two.
We find $x_{4,1} = 0$ and, as expected,

$$x_4 = x_{4,0} + 2 \cdot x_{4,1} = 1 + 2 \cdot 0 = 1.$$


Determining $x_9$: We write

$$x_9 = x_{9,0} + 3 \cdot x_{9,1},$$

where $x_{9,0}, x_{9,1} \in \{0, 1, 2\}$. Recall that we wish to solve

$$h_1 = 286 = 79^{x_9} = g_1^{x_9}.$$

We set $h_2 = h_1^3$ and $g_2 = g_1^3$ and solve the discrete logarithm problem

$$h_2 = 34 = g_2^{x_{9,0}} = 362^{x_{9,0}}$$

in the cyclic group of order three. We find, using our oracle for the discrete logarithm problem in
cyclic groups, that $x_{9,0} = 2$. So we now have

$$\frac{h_1}{g_1^2} = g_2^{x_{9,1}} \pmod{397}.$$

Hence we have $1 = 362^{x_{9,1}}$, which is another discrete logarithm in the cyclic group of order three.
We find $x_{9,1} = 0$ and so conclude that

$$x_9 = x_{9,0} + 3 \cdot x_{9,1} = 2 + 3 \cdot 0 = 2.$$


Determining $x_{11}$: We are already in a cyclic group of prime order, so applying our oracle to the
discrete logarithm problem

$$273 = 290^{x_{11}} \pmod{397},$$

we find that $x_{11} = 6$.

Summary: So we have determined that if

$$208 = 5^x \pmod{397},$$

then $x$ is given by

$$\begin{aligned}
x &= 1 \pmod{4}, \\
x &= 2 \pmod{9}, \\
x &= 6 \pmod{11}.
\end{aligned}$$

If we apply the Chinese Remainder Theorem to this set of three simultaneous equations, then we
obtain that the solution to our discrete logarithm problem is given by $x = 281$.
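The whole procedure, from Algorithm 3.1 down to the digit-by-digit reduction, can be sketched in a few lines for subgroups of $\mathbb{F}_p^*$. This is an illustration under stated assumptions rather than a reference implementation: the prime-order oracle is replaced by a brute-force search, the factorization of the group order is supplied by the caller, the function names are ours, and `pow(M, -1, m)` requires Python 3.8 or later.

```python
def dlog_prime_order(g, h, p, modulus):
    """Brute-force stand-in for the oracle: x in [0, p) with g^x = h."""
    acc = 1
    for x in range(p):
        if acc == h:
            return x
        acc = acc * g % modulus
    raise ValueError("h is not in the subgroup generated by g")

def dlog_prime_power(g, h, p, e, modulus):
    """Solve g^x = h where g has order p^e, one base-p digit at a time."""
    x = 0
    g2 = pow(g, p ** (e - 1), modulus)                   # element of order p
    for t in range(e):
        h1 = h * pow(g, p ** e - x, modulus) % modulus   # h * g^(-x')
        h2 = pow(h1, p ** (e - t - 1), modulus)          # push into C_p
        x += dlog_prime_order(g2, h2, p, modulus) * p ** t
    return x

def crt(residues):
    """Combine pairs (r, m) with pairwise coprime m into x mod prod(m)."""
    x, M = 0, 1
    for r, m in residues:
        k = (r - x) * pow(M, -1, m) % m   # choose k with x + M*k = r (mod m)
        x += M * k
        M *= m
    return x

def pohlig_hellman(g, h, modulus, factorization):
    """factorization maps each prime p to its exponent e in the order of g."""
    N = 1
    for p, e in factorization.items():
        N *= p ** e
    S = []
    for p, e in factorization.items():
        T = p ** e
        g1, h1 = pow(g, N // T, modulus), pow(h, N // T, modulus)
        S.append((dlog_prime_power(g1, h1, p, e, modulus), T))
    return crt(S)


print(pohlig_hellman(5, 208, 397, {2: 2, 3: 2, 11: 1}))
```

Running the sketch on the example above reproduces the three residues and the final answer.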


3.3.  Baby-Step/Giant-Step  Method 


In  our  above  discussion  of  the  Pohlig-Hellman  algorithm  we  assumed  we  had  an  oracle  to  solve 
the  discrete  logarithm  problem  in  cyclic  groups  of  prime  order.  We  shall  now  describe  a  general 
method to solve such problems due to Shanks called the Baby-Step/Giant-Step method. We stress
that  this  is  a  generic  method  which  applies  to  any  cyclic  finite  abelian  group. 

Since  the  intermediate  steps  in  the  Pohlig-Hellman  algorithm  are  quite  simple,  the  difficulty 
of  solving  a  general  discrete  logarithm  problem  will  be  dominated  by  the  time  required  to  solve 
the  discrete  logarithm  problem  in  the  cyclic  subgroups  of  prime  order.  Hence,  for  generic  groups 
the  complexity  of  the  Baby-Step/Giant-Step  method  will  dominate  the  overall  complexity  of  any 
algorithm. Indeed, one can show that the following method is the best possible method, time-wise,
for solving the discrete logarithm problem in an arbitrary group. Of course in any actual
group  there  may  be  a  special  purpose  algorithm  which  works  faster,  but  in  general  the  following  is 
provably  the  best  one  can  do. 

We  fix  notation  as  follows:  We  have  a  public  cyclic  group  G  —  (g),  which  we  can  now  assume 
to  have  prime  order  p.  We  are  also  given  an  h  G  G  and  are  asked  to  find  the  value  of  x  modulo  p 
such  that 

h  =  gx. 

We  assume  there  is  some  fixed  encoding  of  the  elements  of  G,  so  in  particular  it  is  easy  to  store, 
sort  and  search  a  list  of  elements  of  G. 

The idea behind the Baby-Step/Giant-Step method is a standard divide-and-conquer approach found in many areas of computer science. We write

$$x = x_0 + x_1 \cdot \lceil \sqrt{p} \rceil.$$

Now, since $0 \le x < p$, we have that $0 \le x_0, x_1 < \lceil \sqrt{p} \rceil$. We first compute the Baby-Steps

$$g_i \leftarrow g^i \quad \text{for } 0 \le i < \lceil \sqrt{p} \rceil.$$

The pairs

$$(g_i, i)$$

are stored in a table so that one can easily search for items indexed by the first entry in the pair. This can be accomplished by sorting the table on the first entry, or more efficiently by the use of hash tables. To compute and store the Baby-Steps clearly requires

$$O(\lceil \sqrt{p} \rceil)$$

time and a similar amount of storage.


We now compute the Giant-Steps $h_j \leftarrow h \cdot g^{-j \cdot \lceil \sqrt{p} \rceil}$ for $0 \le j < \lceil \sqrt{p} \rceil$, and try to find a match in the table of Baby-Steps, i.e. we try to find a value $g_i$ such that $g_i = h_j$. If such a match occurs we have

$$x_0 = i \quad \text{and} \quad x_1 = j,$$

since, if $g_i = h_j$, we have

$$g^i = h \cdot g^{-j \cdot \lceil \sqrt{p} \rceil},$$

i.e.

$$g^{i + j \cdot \lceil \sqrt{p} \rceil} = h.$$

Notice that the time to compute the Giant-Steps is at most

$$O(\lceil \sqrt{p} \rceil).$$

Hence, the overall time and space complexity of the Baby-Step/Giant-Step method is

$$O(\sqrt{p}).$$

This means, combining with the Pohlig-Hellman algorithm, that if we wish a discrete logarithm problem in a group $G$ to be as difficult as a work effort of $2^{128}$ operations, then we need the group $G$ to have a prime order subgroup of size larger than $2^{256}$.


Baby-Step/Giant-Step Example: As an example we take the subgroup of order 101 in the multiplicative group of the finite field $\mathbb{F}_{607}$, generated by $g = 64$. Suppose we are given the discrete logarithm problem

$$h = 182 = 64^x \pmod{607}.$$

We first compute the Baby-Steps

$$g_i = 64^i \pmod{607} \quad \text{for } 0 \le i < \lceil \sqrt{101} \rceil = 11.$$

We compute

| $i$ | $64^i \pmod{607}$ | $i$ | $64^i \pmod{607}$ |
|---|---|---|---|
| 0 | 1 | 6 | 330 |
| 1 | 64 | 7 | 482 |
| 2 | 454 | 8 | 498 |
| 3 | 527 | 9 | 308 |
| 4 | 343 | 10 | 288 |
| 5 | 100 | | |

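Now we compute the Giant-Steps,

$$h_j = 182 \cdot 64^{-11 \cdot j} \pmod{607} \quad \text{for } 0 \le j < 11,$$

and check when we obtain a Giant-Step which occurs in our table of Baby-Steps:

| $j$ | $182 \cdot 64^{-11 \cdot j} \pmod{607}$ | $j$ | $182 \cdot 64^{-11 \cdot j} \pmod{607}$ |
|---|---|---|---|
| 0 | 182 | 6 | 60 |
| 1 | 143 | 7 | 394 |
| 2 | 69 | 8 | 483 |
| 3 | 271 | 9 | 76 |
| 4 | 343 | 10 | 580 |
| 5 | 573 | | |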



So we obtain a match when $i = 4$ and $j = 4$, which means that

$$x = 4 + 11 \cdot 4 = 48,$$

which we can verify to be the correct answer to the earlier discrete logarithm problem by computing

$$64^{48} \pmod{607} = 182.$$
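
The method and the example above can be reproduced with a short Python sketch. The function name `bsgs` and its argument order are our own choices; we assume the group is a subgroup of $\mathbb{F}_p^*$, use a Python dictionary as the hash table of Baby-Steps, and rely on Python 3.8 or later for negative exponents in `pow`.

```python
from math import isqrt

def bsgs(g, h, p, order):
    """Solve h = g^x (mod p) for x, where g has the given prime order."""
    m = isqrt(order)
    if m * m < order:
        m += 1                       # m = ceil(sqrt(order))
    # Baby-Steps: hash table mapping g^i -> i for 0 <= i < m.
    table = {}
    gi = 1
    for i in range(m):
        table.setdefault(gi, i)
        gi = gi * g % p
    # Giant-Steps: h_j = h * g^(-j*m) for 0 <= j < m.
    step = pow(g, -m, p)             # g^(-m) mod p
    hj = h % p
    for j in range(m):
        if hj in table:              # match g_i = h_j, so x = i + j*m
            return (table[hj] + j * m) % order
        hj = hj * step % p
    return None                      # h does not lie in the subgroup <g>

print(bsgs(64, 182, 607, 101))       # prints 48
```

Both the table and the loop have about $\sqrt{101}$ entries, matching the time and space bound stated above.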


3.4. Pollard-Type Methods

The trouble with the Baby-Step/Giant-Step method is that, although its run time is bounded by $O(\sqrt{p})$, it requires $O(\sqrt{p})$ space. In practice this space requirement is more of a hindrance than the time requirement. Hence, one could ask whether one could trade the large space requirement for a smaller space requirement, but still obtain a time complexity of $O(\sqrt{p})$. We can, but we will now obtain only an expected running time rather than an absolute bound on the running time; thus technically we obtain a Las Vegas-style algorithm as opposed to a deterministic one. There are a number of algorithms which achieve this reduced space requirement, all of which are due to ideas of Pollard.

3.4.1. Pollard’s Rho Algorithm: Suppose $f : S \longrightarrow S$ is a random mapping between a set $S$ and itself, where the size of $S$ is $n$. Now pick a random value $x_0 \in S$ and compute

$$x_{i+1} \leftarrow f(x_i) \quad \text{for } i \ge 0.$$

We consider the values $x_0, x_1, x_2, \ldots$ as a deterministic random walk. By this statement we mean that each step $x_{i+1} = f(x_i)$ of the walk is a deterministic function of the current position $x_i$, but we are assuming that the sequence $x_0, x_1, x_2, \ldots$ behaves as a random sequence would. Another name for a deterministic random walk is a pseudo-random walk.

The goal of many of Pollard’s algorithms is to find a collision in a random mapping like the one above, where a collision is finding a pair of values $x_i$ and $x_j$ with $i \ne j$ such that

$$x_i = x_j.$$

From the birthday paradox from Section 1.4.2, we obtain a collision after an expected number of

$$\sqrt{\pi \cdot n / 2}$$

iterations of the map $f$. Hence, finding a collision using the birthday paradox in a naive way would require $O(\sqrt{n})$ time and $O(\sqrt{n})$ memory. But this large memory requirement is exactly the problem with the Baby-Step/Giant-Step method we were trying to avoid.

But, since $S$ is finite, we must eventually obtain $x_i = x_j$ for some values of $i$ and $j$, and so

$$x_{i+1} = f(x_i) = f(x_j) = x_{j+1}.$$

Hence, the sequence $x_0, x_1, x_2, \ldots$ will eventually become cyclic. If we “draw” such a sequence then it looks like the Greek letter rho, $\rho$. In other words there is a cyclic part and an initial tail. It can be shown, using much the same reasoning as for the birthday bound above, that for a random mapping, the tail has expected length $\sqrt{\pi \cdot n / 8}$, whilst the cycle also has expected length $\sqrt{\pi \cdot n / 8}$. It is this observation which will allow us to reduce the memory requirement to constant space.

To find a collision and make use of the rho shape of the random walk, we use a technique called Floyd’s cycle-finding algorithm: Given $(x_1, x_2)$ we compute $(x_2, x_4)$ and then $(x_3, x_6)$ and so on, i.e. given the pair $(x_i, x_{2i})$ we compute

$$(x_{i+1}, x_{2i+2}) = (f(x_i), f(f(x_{2i}))).$$

We stop when we find

$$x_m = x_{2m}.$$

If the tail of the sequence $x_0, x_1, x_2, \ldots$ has length $\lambda$ and the cycle has length $\mu$ then it can be shown that we obtain such a value of $m$ when

$$m = \mu \cdot (1 + \lfloor \lambda / \mu \rfloor).$$

Since $\lambda \le m \le \lambda + \mu$ we see that

$$m = O(\sqrt{n}),$$

and this will be an accurate complexity estimate if the mapping $f$ behaves suitably like a random function. Hence, we can detect a collision with virtually no storage.
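
Floyd’s cycle-finding algorithm is short enough to sketch directly. The function name `floyd` and the toy map $x \mapsto x^2 + 1 \pmod{255}$ are our own illustrative choices, not taken from the text; the sketch simply advances one copy of the walk by one step and another by two steps until they agree.

```python
def floyd(f, x0):
    """Return (m, x_m) for the first m >= 1 with x_m == x_{2m}."""
    tortoise = f(x0)                 # x_1
    hare = f(f(x0))                  # x_2
    m = 1
    while tortoise != hare:
        tortoise = f(tortoise)       # x_m     -> x_{m+1}
        hare = f(f(hare))            # x_{2m}  -> x_{2m+2}
        m += 1
    return m, tortoise

# Toy walk: 3, 10, 101, 2, 5, 26, 167, 95, 101, ...
# The tail has length 2 and the cycle has length 6.
print(floyd(lambda x: (x * x + 1) % 255, 3))   # prints (6, 167)
```

Only the two current positions are stored, which is the constant-space property used in the rest of this section.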

This is all very well, but we have not shown how to relate this to the discrete logarithm problem. Let $G$ denote a group of order $n$ and let the discrete logarithm problem be given by

$$h = g^x.$$

We partition the group into three sets $S_1, S_2, S_3$, where we assume $1 \notin S_2$, and then define the following random walk on the group $G$,

$$x_{i+1} \leftarrow f(x_i) = \begin{cases} h \cdot x_i & x_i \in S_1, \\ x_i^2 & x_i \in S_2, \\ g \cdot x_i & x_i \in S_3. \end{cases}$$

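The condition that $1 \notin S_2$ is to ensure that the function $f$ has no stationary points. In practice we actually keep track of three pieces of information

$$(x_i, a_i, b_i) \in G \times \mathbb{Z} \times \mathbb{Z}$$

where

$$a_{i+1} = \begin{cases} a_i & x_i \in S_1, \\ 2 \cdot a_i \pmod{n} & x_i \in S_2, \\ a_i + 1 \pmod{n} & x_i \in S_3, \end{cases}$$

and

$$b_{i+1} = \begin{cases} b_i + 1 \pmod{n} & x_i \in S_1, \\ 2 \cdot b_i \pmod{n} & x_i \in S_2, \\ b_i & x_i \in S_3. \end{cases}$$

If we start with the triple

$$(x_0, a_0, b_0) = (1, 0, 0)$$

then we have, for all $i$,

$$\log_g(x_i) = a_i + b_i \cdot \log_g(h) = a_i + b_i \cdot x.$$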

Applying Floyd’s cycle-finding algorithm we obtain a collision, and so find a value of $m$ such that

$$x_m = x_{2m}.$$

This leads us to deduce the following equality of discrete logarithms

$$a_m + b_m \cdot x = a_m + b_m \cdot \log_g(h) = \log_g(x_m) = \log_g(x_{2m}) = a_{2m} + b_{2m} \cdot \log_g(h) = a_{2m} + b_{2m} \cdot x.$$

Rearranging we see that

$$(b_m - b_{2m}) \cdot x = a_{2m} - a_m,$$

and so, if $b_m \ne b_{2m}$, we obtain

$$x = \frac{a_{2m} - a_m}{b_m - b_{2m}} \pmod{n}.$$

The probability that we have $b_m = b_{2m}$ is small enough to be ignored for large $n$. Thus the above algorithm will find the discrete logarithm in expected time $O(\sqrt{n})$.


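Pollard’s Rho Example: As an example consider the subgroup $G$ of $\mathbb{F}_{607}^*$ of order $n = 101$ generated by the element $g = 64$ and the discrete logarithm problem

$$h = 122 = 64^x.$$

We define the sets $S_1, S_2, S_3$ as follows:

$$S_1 = \{x \in \mathbb{F}_{607}^* : x \le 201\},$$

$$S_2 = \{x \in \mathbb{F}_{607}^* : 202 \le x \le 403\},$$

$$S_3 = \{x \in \mathbb{F}_{607}^* : 404 \le x \le 606\}.$$

Applying Pollard’s Rho method we obtain the following data

| $i$ | $x_i$ | $a_i$ | $b_i$ | $x_{2i}$ | $a_{2i}$ | $b_{2i}$ |
|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| 1 | 122 | 0 | 1 | 316 | 0 | 2 |
| 2 | 316 | 0 | 2 | 172 | 0 | 8 |
| 3 | 308 | 0 | 4 | 137 | 0 | 18 |
| 4 | 172 | 0 | 8 | 7 | 0 | 38 |
| 5 | 346 | 0 | 9 | 309 | 0 | 78 |
| 6 | 137 | 0 | 18 | 352 | 0 | 56 |
| 7 | 325 | 0 | 19 | 167 | 0 | 12 |
| 8 | 7 | 0 | 38 | 498 | 0 | 26 |
| 9 | 247 | 0 | 39 | 172 | 2 | 52 |
| 10 | 309 | 0 | 78 | 137 | 4 | 5 |
| 11 | 182 | 0 | 55 | 7 | 8 | 12 |
| 12 | 352 | 0 | 56 | 309 | 16 | 26 |
| 13 | 76 | 0 | 11 | 352 | 32 | 53 |
| 14 | 167 | 0 | 12 | 167 | 64 | 6 |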

So we obtain a collision, using Floyd’s cycle-finding algorithm, when $m = 14$. We see that

$$g^0 \cdot h^{12} = g^{64} \cdot h^6$$

which implies

$$12 \cdot x = 64 + 6 \cdot x \pmod{101}.$$

Consequently

$$x = \frac{64}{12 - 6} \pmod{101} = 78.$$
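
The example can be reproduced with the following Python sketch. The function name `pollard_rho_dlog` is ours, and the three-way split of $\{1, \ldots, p-1\}$ into equal-sized ranges is an assumption chosen so that for $p = 607$ it matches the sets $S_1, S_2, S_3$ above; other partitions are possible.

```python
def pollard_rho_dlog(g, h, p, n):
    """Solve h = g^x (mod p), where g has prime order n, via Floyd's method."""
    third = (p - 1) // 3

    def step(x, a, b):
        # Invariant: x = g^a * h^b (mod p).
        if x < third:                        # S_1: multiply by h
            return h * x % p, a, (b + 1) % n
        if x < 2 * third:                    # S_2: square
            return x * x % p, 2 * a % n, 2 * b % n
        return g * x % p, (a + 1) % n, b     # S_3: multiply by g

    tortoise = hare = (1, 0, 0)              # (x_0, a_0, b_0)
    while True:
        tortoise = step(*tortoise)           # (x_m, a_m, b_m)
        hare = step(*step(*hare))            # (x_2m, a_2m, b_2m)
        if tortoise[0] == hare[0]:
            break
    _, am, bm = tortoise
    _, a2m, b2m = hare
    d = (bm - b2m) % n
    if d == 0:
        return None                          # unlucky collision, no information
    return (a2m - am) * pow(d, -1, n) % n

print(pollard_rho_dlog(64, 122, 607, 101))   # prints 78
```

The collision is found at $m = 14$, exactly as in the table, while only two triples are ever held in memory.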




3.4.2. Pollard’s Lambda Method: Pollard’s Lambda method is like the Rho method, in that we use a deterministic random walk and a small amount of storage to solve the discrete logarithm problem. However, the Lambda method is particularly tuned to the situation where we know that the discrete logarithm lies in a certain interval

$$x \in [a, \ldots, b].$$

In the Rho method we used one random walk, which turned into the shape of the Greek letter $\rho$, whilst in the Lambda method we use two walks which end up in the shape of the Greek letter lambda, i.e. $\lambda$, hence giving the method its name. Another name for this method is Pollard’s Kangaroo method as it was originally described with the two walks being performed by kangaroos.

Let $w = b - a$ denote the length of the interval in which the discrete logarithm $x$ is known to lie. We define a set

$$S = \{s_0, \ldots, s_{k-1}\}$$

of integers in non-decreasing order. The mean $m$ of the set should be around $N = \lfloor \sqrt{w} \rfloor$. It is common to choose

$$s_i = 2^i \quad \text{for } 0 \le i < k,$$

which implies that the mean of the set is

$$\frac{2^k}{k},$$

and so we choose

$$k \approx \frac{1}{2} \cdot \log_2(w).$$

We divide the group up into $k$ sets $S_i$, for $i = 0, \ldots, k-1$, and define the following deterministic random walk:

$$x_{i+1} = x_i \cdot g^{s_j} \quad \text{if } x_i \in S_j.$$

We first compute the deterministic random walk starting from the end of the interval, $g_0 = g^b$, by setting

$$g_{i+1} = g_i \cdot g^{s_j} \quad \text{if } g_i \in S_j,$$

for $i = 0, \ldots, N-1$, where $N = \lfloor \sqrt{w} \rfloor$. We also set $c_0 = b$ and $c_{i+1} = c_i + s_j \pmod{q}$, where $q$ denotes the order of $g$. We store $g_N$ and notice that we have computed the discrete logarithm of $g_N$ with respect to $g$,

$$c_N = \log_g(g_N).$$

We now compute our second deterministic random walk, starting from the point in the interval corresponding to the unknown $x$; we set $h_0 = h = g^x$ and compute

$$h_{i+1} = h_i \cdot g^{s_j} \quad \text{if } h_i \in S_j.$$

We also set $d_0 = 0$ and $d_{i+1} = d_i + s_j \pmod{q}$. Notice that we have

$$\log_g(h_i) = x + d_i.$$

If the path of the $h_i$ ever meets the path of the $g_i$ then the $h_i$ will carry on along the path of the $g_i$, and so eventually reach the point $g_N$. Thus, we are able to find a value $M$ where $h_M$ equals our stored point $g_N$. We then have

$$c_N = \log_g(g_N) = \log_g(h_M) = x + d_M,$$

and so the solution to our discrete logarithm problem is given by

$$x = c_N - d_M \pmod{q}.$$

If we do not get a collision then we can increase $N$ and continue both walks in a similar manner until a collision does occur.

The expected running time of this method is $O(\sqrt{w})$ and again the storage can be seen to be constant. The Lambda method can be used when the discrete logarithm is only known to lie in the full interval $[0, \ldots, q-1]$. But in this situation, whilst the asymptotic complexity is the same as the Rho method, the Rho method is better due to the implied constants.

Pollard’s Lambda Example: As an example we again consider the subgroup $G$ of $\mathbb{F}_{607}^*$ of order $n = 101$ generated by the element $g = 64$, but now we look at the discrete logarithm problem

$$h = 524 = 64^x.$$

We are given that the discrete logarithm $x$ lies in the interval $[60, \ldots, 80]$. As our set of multipliers $s_i$ we take $s_i = 2^i$ for $i = 0, 1, 2, 3$. The subsets $S_0, \ldots, S_3$ of $G$ we define by

$$S_i = \{g \in G : g \pmod{4} = i\}.$$

We first compute the deterministic random walk $g_i$ and the discrete logarithms $c_i = \log_g(g_i)$, for $i = 0, \ldots, N = \lfloor \sqrt{80 - 60} \rfloor = 4$.

| $i$ | $g_i$ | $c_i$ |
|---|---|---|
| 0 | 151 | 80 |
| 1 | 537 | 88 |
| 2 | 391 | 90 |
| 3 | 478 | 98 |
| 4 | 64 | 1 |

Now we compute the second deterministic random walk

| $i$ | $h_i$ | $d_i = \log_g(h_i) - x$ |
|---|---|---|
| 0 | 524 | 0 |
| 1 | 151 | 1 |
| 2 | 537 | 9 |
| 3 | 391 | 11 |
| 4 | 478 | 19 |
| 5 | 64 | 23 |

Hence, we obtain the collision $h_5 = g_4$ and so

$$x = 1 - 23 \pmod{101} = 79.$$

Note that examining the above tables we see that we had earlier collisions between our two walks. However, we are unable to use these since we do not store $g_0$, $g_1$, $g_2$ or $g_3$. We have only stored the value of $g_4$.
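
The two walks of the example can be sketched in Python as follows. The function name `pollard_lambda` is ours. We assume, as in the example, that the subsets are defined by the residue of the group element modulo $k$, and we add a simple stopping rule of our own (give up once the second walk must have jumped past the stored point) instead of the text’s suggestion of increasing $N$.

```python
from math import isqrt

def pollard_lambda(g, h, p, q, a, b, k=4):
    """Solve h = g^x (mod p) for x in [a, b], where g has order q."""
    s = [2 ** i for i in range(k)]           # multipliers s_i = 2^i
    N = isqrt(b - a)                         # N = floor(sqrt(w))
    # First walk: start at g^b, take N steps, keep only the end point g_N.
    gN, cN = pow(g, b, p), b                 # cN is kept as a plain integer
    for _ in range(N):
        j = gN % k                           # gN lies in S_j
        gN, cN = gN * pow(g, s[j], p) % p, cN + s[j]
    # Second walk: start at h = g^x and track the distance d travelled.
    hM, dM = h % p, 0
    while dM <= cN - a:                      # beyond this the walk has overshot
        if hM == gN:
            return (cN - dM) % q             # x = c_N - d_M (mod q)
        j = hM % k
        hM, dM = hM * pow(g, s[j], p) % p, dM + s[j]
    return None                              # the walks did not meet

print(pollard_lambda(64, 524, 607, 101, 60, 80))   # prints 79
```

Apart from the multipliers, the only stored values are the single point $g_N$ with its logarithm and the current state of the second walk.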

3.4.3. Parallel Pollard’s Rho: In real life when we use random-walk-based techniques to solve discrete logarithm problems we use a parallel version, to exploit the computing resources of a number of sites across the Internet. Suppose we are given the discrete logarithm problem

$$h = g^x$$

in a group $G$ of prime order $q$. We first decide on an easily computable function

$$H : G \longrightarrow \{1, \ldots, k\},$$

where $k$ is usually around 20. Then we define a set of multipliers $m_i$. These are produced by generating random integers $a_i, b_i \in [0, \ldots, q-1]$ and then setting

$$m_i = g^{a_i} \cdot h^{b_i}.$$

To start a deterministic random walk we pick random $s_0, t_0 \in [0, \ldots, q-1]$ and compute

$$g_0 = g^{s_0} \cdot h^{t_0};$$

the deterministic random walk is then defined on the triples $(g_i, s_i, t_i)$ where

$$g_{i+1} = g_i \cdot m_{H(g_i)},$$

$$s_{i+1} = s_i + a_{H(g_i)} \pmod{q},$$

$$t_{i+1} = t_i + b_{H(g_i)} \pmod{q}.$$

Hence, for every $g_i$ we record the values of $s_i$ and $t_i$ such that

$$g_i = g^{s_i} \cdot h^{t_i}.$$

Suppose we have $m$ processors: each processor starts a different deterministic random walk from a different starting position using the same algorithm to determine the next element in the walk. When another processor, or even the same processor, meets an element of the group that has been seen before then we obtain an equation

$$g^{s_i} \cdot h^{t_i} = g^{s_j} \cdot h^{t_j}$$

which we can solve for the discrete logarithm $x$. Hence, we expect that after $O\left(\sqrt{(\pi \cdot q)/(2 \cdot m^2)}\right)$ iterations of these parallel walks we will find a collision and so solve the discrete logarithm problem.

However, as described this means that each processor needs to return every element in its computed deterministic random walk to a central server which then stores all the computed elements. This is highly inefficient as the storage requirements will be very large, namely $O\left(\sqrt{(\pi \cdot q)/2}\right)$. We can reduce the storage to any required value as follows: We first define a function $d$ on the group

$$d : G \longrightarrow \{0, 1\}$$

such that $d(g) = 1$ around $1/2^t$ of the time for any $g \in G$. The function $d$ is often defined by returning $d(g) = 1$ if, for example, a certain subset of $t$ of the bits representing the group element $g \in G$ are set to zero. The elements in $G$ for which $d(g) = 1$ will be called distinguished.

It is only the distinguished group elements that are now transmitted back to the central server. This means that we expect the deterministic random walks to need to continue another $2^t$ steps before a collision is detected between two deterministic random walks. Hence, the computing time now becomes

$$O\left(\sqrt{(\pi \cdot q)/(2 \cdot m^2)} + 2^t\right),$$

whilst the storage becomes

$$O\left(\sqrt{(\pi \cdot q)/2^{2 \cdot t + 1}}\right).$$

This allows the storage to be reduced to any manageable amount, at the expense of a little extra computation. We do not give an example, since the method only really becomes useful as $q$ becomes large (say $q > 2^{20}$).
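
Although the method only pays off for large $q$, its bookkeeping can be illustrated on the toy group used earlier. The sketch below is a single-process simulation, not a distributed implementation, and several details are our own assumptions: the name `parallel_rho`, the choice $H(y) = y \bmod k$, calling an element distinguished when its low $t$ bits are zero, restarting a walk once it reports a distinguished point, and abandoning walks that appear stuck in a cycle with no distinguished point.

```python
import random

def parallel_rho(g, h, p, q, walkers=4, k=20, t=2, seed=0):
    """Simulate `walkers` walks with distinguished points; solve h = g^x (mod p)."""
    rng = random.Random(seed)
    # Multipliers m_i = g^{a_i} * h^{b_i} for random a_i, b_i in [0, q-1].
    a = [rng.randrange(q) for _ in range(k)]
    b = [rng.randrange(q) for _ in range(k)]
    mult = [pow(g, a[i], p) * pow(h, b[i], p) % p for i in range(k)]

    def fresh_walk():
        s0, t0 = rng.randrange(q), rng.randrange(q)
        return [pow(g, s0, p) * pow(h, t0, p) % p, s0, t0, 0]

    walks = [fresh_walk() for _ in range(walkers)]
    server = {}                        # distinguished point -> (s, t) exponents
    limit = 20 * 2 ** t                # abandon walks that look stuck in a cycle
    for _ in range(1_000_000):
        for w in range(walkers):
            y, s, tt, steps = walks[w]
            if y % 2 ** t == 0:        # distinguished: report to the "server"
                if y in server and server[y][1] != tt:
                    s2, t2 = server[y]
                    # g^s h^tt = g^s2 h^t2  =>  x = (s - s2)/(t2 - tt) mod q
                    return (s - s2) * pow((t2 - tt) % q, -1, q) % q
                server[y] = (s, tt)
                walks[w] = fresh_walk()
            elif steps >= limit:
                walks[w] = fresh_walk()
            else:
                j = y % k              # toy stand-in for the function H
                walks[w] = [y * mult[j] % p, (s + a[j]) % q,
                            (tt + b[j]) % q, steps + 1]
    raise RuntimeError("no useful collision found")

x = parallel_rho(64, 524, 607, 101)
print(x, pow(64, x, 607) == 524)
```

Only distinguished points reach the dictionary `server`, which is where the storage saving described above comes from.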


3.5.  Sub-exponential  Methods  for  Finite  Fields 

There is a close relationship between the sub-exponential methods for factoring and the sub-exponential methods for solving the discrete logarithm problem in finite fields. We shall only consider the case of prime fields $\mathbb{F}_p$ but similar considerations apply to finite fields of small characteristic; here we use a special algorithm called the Function Field Sieve. The sub-exponential algorithms for finite fields are often referred to as index-calculus algorithms, as an index is an old name for a discrete logarithm.

We assume we are given $g, h \in \mathbb{F}_p^*$, such that

$$h = g^x.$$

We choose a factorbase $\mathcal{F}$ of elements, usually small prime numbers, and then, using one of the sieving strategies used for factoring, we obtain a large number of relations of the form

$$\prod_{p_i \in \mathcal{F}} p_i^{e_i} = 1 \pmod{p}.$$

These relations translate into the following equations for discrete logarithms,

$$\sum_{p_i \in \mathcal{F}} e_i \cdot \log_g(p_i) = 0 \pmod{p-1}.$$

Once enough equations like the one above have been found we can solve for the discrete logarithm of every element in the factorbase, i.e. we can determine

$$x_i = \log_g(p_i).$$

The value of $x_i$ is sometimes called the index of $p_i$ with respect to $g$. This calculation is performed using linear algebra modulo $p - 1$, which is more complicated than the linear algebra modulo two performed in factoring algorithms. However, tricks similar to those deployed in the linear algebra stage of factoring algorithms can be used to keep storage requirements down to manageable levels.

This linear algebra calculation only needs to be done once for each generator $g$, and the results can then be used for many values of $h$. When we wish to solve a particular discrete logarithm problem $h = g^x$, we use a sieving technique, or simple trial and error, to write

$$h = \prod_{p_i \in \mathcal{F}} p_i^{h_i} \pmod{p},$$

e.g. we could compute

$$T = h \cdot \prod_{p_i \in \mathcal{F}} p_i^{f_i} \pmod{p}$$

and see whether it factors in the form

$$T = \prod_{p_i \in \mathcal{F}} p_i^{g_i}.$$

If it does then we have

$$h = \prod_{p_i \in \mathcal{F}} p_i^{g_i - f_i} \pmod{p}.$$

We can then compute the discrete logarithm $x$ from

$$x = \log_g(h) = \log_g\left(\prod_{p_i \in \mathcal{F}} p_i^{h_i}\right) = \sum_{p_i \in \mathcal{F}} h_i \cdot \log_g(p_i) \pmod{p-1} = \sum_{p_i \in \mathcal{F}} h_i \cdot x_i \pmod{p-1}.$$

This means that, once one discrete logarithm has been found, determining the next one is easier since we have already computed the values of the $x_i$.
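
The second stage, in which an individual $h$ is written over the factorbase, can be illustrated on a toy field. In the sketch below the names `index_calculus_toy` and `smooth_exponents`, the prime $p = 101$ with generator $g = 2$, and the factorbase $\{2, 3, 5, 7\}$ are all our own choices. The indices $x_i$ are found by brute force, purely as a stand-in for the relation collection and linear algebra described above, which would be infeasible to replace this way at real sizes.

```python
from itertools import product

def smooth_exponents(t, factor_base):
    """Return exponents g_i with t = prod p_i^g_i, or None if t is not smooth."""
    exps = []
    for q in factor_base:
        e = 0
        while t % q == 0:
            t //= q
            e += 1
        exps.append(e)
    return exps if t == 1 else None

def index_calculus_toy(g, h, p, factor_base=(2, 3, 5, 7)):
    """Solve h = g^x (mod p) for a generator g of F_p^* (toy sizes only)."""
    if h % p == 0:
        raise ValueError("h must be non-zero modulo p")
    # Stage 1 stand-in: indices x_i = log_g(p_i), found here by brute force.
    index = {}
    y = 1
    for e in range(p - 1):
        if y in factor_base:
            index[y] = e
        y = y * g % p
    if len(index) != len(factor_base):
        raise ValueError("g does not generate every factorbase element")
    # Stage 2: try exponent vectors f until T = h * prod p_i^f_i is smooth.
    for f in product(range(p - 1), repeat=len(factor_base)):
        T = h % p
        for q, fi in zip(factor_base, f):
            T = T * pow(q, fi, p) % p
        exps = smooth_exponents(T, factor_base)
        if exps is not None:
            # h = prod p_i^(g_i - f_i), so x = sum (g_i - f_i) * x_i mod (p-1).
            return sum((gi - fi) * index[q]
                       for q, gi, fi in zip(factor_base, exps, f)) % (p - 1)

x = index_calculus_toy(2, 37, 101)
print(pow(2, x, 101) == 37)              # prints True
```

Once the dictionary `index` has been computed it can be reused for every further target $h$, which is the point made in the paragraph above.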

The best of the methods to find the relations between the factorbase elements is the Number Field Sieve. This gives an overall running time of $O(L_p(1/3, c))$ for some constant $c$. This is roughly the same complexity as the algorithms to factor large numbers, although the real practical problem is that the matrix algorithms now need to work modulo $p - 1$ and not modulo 2 as they did in the factoring algorithms.

The upshot of these sub-exponential methods is that the size of $p$ for finite field discrete-logarithm-based systems needs to be of the same order of magnitude as a factoring modulus, i.e. $p > 2^{2048}$. Even though $p$ has to be very large we still need to guard against generic attacks, hence $p - 1$ should have a prime factor $q$ of order greater than $2^{256}$. In fact, for finite-field-based systems we usually work in the subgroup of $\mathbb{F}_p^*$ of order $q$.

In 2013, Antoine Joux and others showed that discrete logarithms in finite fields of characteristic two can be determined in quasi-polynomial time. This result is striking, but the methods used do not seem to generalize to higher-characteristic fields. This confirms the long-standing belief that discrete-logarithm-based cryptographic systems in fields of low characteristic should be avoided.


Chapter  Summary 


•  We  covered  the  DLP,  DHP  and  DDH  problems  and  the  relationships  between  them. 

•  Due  to  the  Pohlig-Hellman  algorithm  a  hard  discrete  logarithm  problem  should  be  set  in 
a  group  where  the  order  has  a  large  prime  factor. 

•  Generic  algorithms  such  as  the  Baby-Step/Giant-Step  algorithm  mean  that  to  achieve  the 
same  security  as  a  128-bit  block  cipher,  the  size  of  the  large  prime  factor  of  the  group 
order  should  be  at  least  256  bits. 

•  The Baby-Step/Giant-Step algorithm is a generic algorithm and its running time can be absolutely bounded by $O(\sqrt{q})$, where $q$ is the size of the large prime factor of $\#G$. However, the storage requirements of the Baby-Step/Giant-Step algorithm are also $O(\sqrt{q})$.

•  There are a number of techniques, due to Pollard, based on deterministic random walks in a group. These are generic algorithms which require little storage but which solve the discrete logarithm problem in expected time $O(\sqrt{q})$.

•  For finite fields a number of index calculus algorithms exist which run in sub-exponential time. These mean that one needs to take large finite fields $\mathbb{F}_{p^t}$ with $p^t > 2^{2048}$ to obtain a hard discrete logarithm problem. Due to the new quasi-polynomial-time attacks on low-characteristic fields we should select the characteristic to not be “too small”.


Further  Reading 

There  are  a  number  of  good  surveys  on  the  discrete  logarithm  problem.  I  would  recommend 
the  ones  by  McCurley  and  Odlyzko.  For  a  more  modern  perspective  see  the  article  by  Odlyzko, 
Pierrot  and  Joux. 

K. McCurley. The discrete logarithm problem. In Cryptology and Computational Number Theory, Proc. Symposia in Applied Maths, Volume 42, 47-94, AMS, 1990.

A. Odlyzko. Discrete logarithms: The past and the future. Designs, Codes and Cryptography, 19, 129-145, 2000.

A. Odlyzko, C. Pierrot and A. Joux. The past, evolving present, and future of the discrete logarithm. In Open Problems in Mathematics and Computational Science, Springer, 2014.


CHAPTER  4 


Elliptic  Curves 


Chapter  Goals 

•  To  describe  what  an  elliptic  curve  is. 

•  To  explain  the  basic  mathematics  behind  elliptic  curve  cryptography. 

•  To  show  how  projective  coordinates  can  be  used  to  improve  computational  efficiency. 

•  To  show  how  point  compression  can  be  used  to  improve  communications  efficiency. 

4.1.  Introduction 

This  chapter  is  devoted  to  introducing  elliptic  curves.  Some  of  the  more  modern  public  key  systems 
make  use  of  elliptic  curves  since  they  can  offer  improved  efficiency  and  bandwidth.  Since  much  of 
this  book  can  be  read  with  just  the  understanding  that  an  elliptic  curve  provides  another  finite 
abelian  group  in  which  one  can  pose  a  discrete  logarithm  problem,  you  may  decide  to  skip  this 
chapter  on  an  initial  reading. 

Let $K$ be any field. The projective plane $\mathbb{P}^2(K)$ over $K$ is defined as the set of triples

$$(X, Y, Z)$$

where $X, Y, Z \in K$ are not all simultaneously zero. On these triples is defined an equivalence relation

$$(X, Y, Z) \equiv (X_1, Y_1, Z_1)$$

if there exists a $\lambda \in K^*$ such that

$$X = \lambda \cdot X_1, \quad Y = \lambda \cdot Y_1 \quad \text{and} \quad Z = \lambda \cdot Z_1.$$

So, for example, if $K = \mathbb{F}_7$, the finite field of seven elements, then the two points

$$(4, 1, 1) \quad \text{and} \quad (5, 3, 3)$$

are equivalent (take $\lambda = 5$). Such a triple is called a projective point.

An elliptic curve over $K$ will be defined as the set of solutions in the projective plane $\mathbb{P}^2(K)$ of a homogeneous Weierstrass equation of the form

$$E : Y^2 \cdot Z + a_1 \cdot X \cdot Y \cdot Z + a_3 \cdot Y \cdot Z^2 = X^3 + a_2 \cdot X^2 \cdot Z + a_4 \cdot X \cdot Z^2 + a_6 \cdot Z^3,$$

with $a_1, a_2, a_3, a_4, a_6 \in K$. This equation is also referred to as the long Weierstrass form. Such a curve should be non-singular in the sense that, if the equation is written in the form $F(X, Y, Z) = 0$, then the partial derivatives of the curve equation

$$\partial F / \partial X, \quad \partial F / \partial Y \quad \text{and} \quad \partial F / \partial Z$$

should not vanish simultaneously at any point on the curve, i.e. the three simultaneous equations have no zero defined over the algebraic closure $\overline{K}$.

The set of $K$-rational points on $E$, i.e. the solutions in $\mathbb{P}^2(K)$ to the above equation, is denoted by $E(K)$. Notice that the curve has exactly one rational point with coordinate $Z$ equal to zero, namely $(0, 1, 0)$. This is called the point at infinity, which will be denoted by $\mathcal{O}$.
4.1.1. The Affine Form: For convenience, we will most often use the affine version of the Weierstrass equation, given by

$$E : Y^2 + a_1 \cdot X \cdot Y + a_3 \cdot Y = X^3 + a_2 \cdot X^2 + a_4 \cdot X + a_6, \qquad (5)$$

where $a_i \in K$; this is obtained by setting $Z = 1$ in the above equation. The $K$-rational points in the affine case are the solutions to $E$ in $K^2$, plus the point at infinity $\mathcal{O}$. Although most protocols for elliptic-curve-based cryptography make use of the affine form of a curve, it is often computationally important to be able to switch to projective coordinates. Luckily this switch is easy:

•  The point at infinity always maps to the point at infinity in either direction.

•  To map a projective point $(X, Y, Z)$ which is not at infinity, so $Z \ne 0$, to an affine point we simply compute $(X/Z, Y/Z)$.

•  To map an affine point $(X, Y)$, which is not at infinity, to a projective point we take a random non-zero $Z \in K$ and compute $(X \cdot Z, Y \cdot Z, Z)$.
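
The three conversion rules can be sketched in Python for a prime field $\mathbb{F}_p$. The function names `to_affine` and `to_projective` are ours, the point at infinity is represented by `None` in affine form, and the non-zero $Z$ is passed in as a parameter rather than chosen at random so that the result is reproducible.

```python
def to_affine(P, p):
    """Map a projective point (X, Y, Z) over F_p to affine form (None is O)."""
    X, Y, Z = P
    if Z % p == 0:
        return None                      # the point at infinity
    zinv = pow(Z, -1, p)                 # 1/Z in F_p (Python 3.8+)
    return (X * zinv % p, Y * zinv % p)

def to_projective(P, p, Z=1):
    """Map an affine point (X, Y) over F_p to (X*Z, Y*Z, Z) for non-zero Z."""
    if P is None:
        return (0, 1, 0)                 # the point at infinity
    X, Y = P
    return (X * Z % p, Y * Z % p, Z % p)

# The equivalent projective points (4, 1, 1) and (5, 3, 3) over F_7.
print(to_affine((4, 1, 1), 7), to_affine((5, 3, 3), 7))   # prints (4, 1) (4, 1)
print(to_projective((4, 1), 7, Z=3))                       # prints (5, 3, 3)
```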

As we shall see later it is often more convenient to use a slightly modified form of projective point where the projective point $(X, Y, Z)$ represents the affine point $(X/Z^2, Y/Z^3)$, which equates to using the projective equation

$$E : Y^2 + a_1 \cdot X \cdot Y \cdot Z + a_3 \cdot Y \cdot Z^3 = X^3 + a_2 \cdot X^2 \cdot Z^2 + a_4 \cdot X \cdot Z^4 + a_6 \cdot Z^6.$$

4.1.2. Isomorphisms of Elliptic Curves: Given an elliptic curve defined by equation (5), it is useful to define the following constants for use in later formulae:

$$b_2 = a_1^2 + 4 \cdot a_2,$$

$$b_4 = a_1 \cdot a_3 + 2 \cdot a_4,$$

$$b_6 = a_3^2 + 4 \cdot a_6,$$

$$b_8 = a_1^2 \cdot a_6 + 4 \cdot a_2 \cdot a_6 - a_1 \cdot a_3 \cdot a_4 + a_2 \cdot a_3^2 - a_4^2,$$

$$c_4 = b_2^2 - 24 \cdot b_4,$$

$$c_6 = -b_2^3 + 36 \cdot b_2 \cdot b_4 - 216 \cdot b_6.$$

The discriminant of the curve is defined as

$$\Delta = -b_2^2 \cdot b_8 - 8 \cdot b_4^3 - 27 \cdot b_6^2 + 9 \cdot b_2 \cdot b_4 \cdot b_6.$$

When the characteristic of the field satisfies $\operatorname{char} K \ne 2, 3$ the discriminant can also be expressed as

$$\Delta = (c_4^3 - c_6^2)/1728.$$

Notice that $1728 = 2^6 \cdot 3^3$ so, if the characteristic of the underlying finite field is not equal to 2 or 3, dividing by this latter quantity makes sense. A curve is then non-singular if and only if $\Delta \ne 0$; from now on we shall assume that $\Delta \ne 0$ in all our discussions. When $\Delta \ne 0$, the $j$-invariant of the curve is defined as

$$j(E) = c_4^3 / \Delta.$$

As an example, which we shall use throughout this chapter, we consider the elliptic curve

$$E : Y^2 = X^3 + X + 3$$

defined over the field $\mathbb{F}_7$. Computing the various quantities above we find that we have

$$\Delta = 3 \quad \text{and} \quad j(E) = 5.$$
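
These quantities are easy to evaluate mechanically. The following Python sketch, with a function name `curve_invariants` of our own choosing, computes $\Delta$ and $j(E)$ for a curve in long Weierstrass form over a prime field $\mathbb{F}_p$ directly from the formulae above.

```python
def curve_invariants(a1, a2, a3, a4, a6, p):
    """Return (discriminant, j-invariant) of a Weierstrass curve over F_p."""
    b2 = a1 * a1 + 4 * a2
    b4 = a1 * a3 + 2 * a4
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    c4 = b2 * b2 - 24 * b4
    disc = (-b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6) % p
    if disc == 0:
        raise ValueError("singular curve: the discriminant is zero")
    j = pow(c4, 3, p) * pow(disc, -1, p) % p     # j = c4^3 / disc in F_p
    return disc, j

# E : Y^2 = X^3 + X + 3 over F_7, i.e. a4 = 1, a6 = 3 and the rest zero.
print(curve_invariants(0, 0, 0, 1, 3, 7))        # prints (3, 5)
```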

The $j$-invariant is closely related to the notion of elliptic curve isomorphism. Two elliptic curves defined by Weierstrass equations $E$ (with variables $X, Y$) and $E_1$ (with variables $X_1, Y_1$) are isomorphic over $K$ if and only if there exist constants $r, s, t \in K$ and $u \in K^*$, such that the change of variables

$$X = u^2 \cdot X_1 + r, \quad Y = u^3 \cdot Y_1 + s \cdot u^2 \cdot X_1 + t$$

transforms $E$ into $E_1$. This transformation might look special, but arises from the requirements that the isomorphism should map the two points at infinity to themselves, and should keep unchanged the structure of the affine equation (5).

Such an isomorphism defines a bijection between the set of rational points in $E$ and the set of rational points in $E_1$. Notice that isomorphism is defined relative to the field $K$. As an example consider again the elliptic curve

$$E : Y^2 = X^3 + X + 3$$

over the field $\mathbb{F}_7$. Now make the change of variables defined by $[u, r, s, t] = [2, 3, 4, 5]$, i.e.

$$X = 4 \cdot X_1 + 3 \quad \text{and} \quad Y = Y_1 + 2 \cdot X_1 + 5.$$

We then obtain the isomorphic curve

$$E_1 : Y_1^2 + 4 \cdot X_1 \cdot Y_1 + 3 \cdot Y_1 = X_1^3 + X_1 + 1,$$

and we have

$$j(E) = j(E_1) = 5.$$
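
Since $\mathbb{F}_7$ is tiny, the bijection between the affine points of $E_1$ and $E$ can be checked exhaustively. The Python sketch below uses our own helper names (`on_E`, `on_E1`, `phi`); it enumerates all affine points of both curves and applies the change of variables with $[u, r, s, t] = [2, 3, 4, 5]$.

```python
p = 7

def on_E(X, Y):
    """Is (X, Y) on E : Y^2 = X^3 + X + 3 over F_7?"""
    return (Y * Y - (X ** 3 + X + 3)) % p == 0

def on_E1(X1, Y1):
    """Is (X1, Y1) on E_1 : Y1^2 + 4 X1 Y1 + 3 Y1 = X1^3 + X1 + 1 over F_7?"""
    return (Y1 * Y1 + 4 * X1 * Y1 + 3 * Y1 - (X1 ** 3 + X1 + 1)) % p == 0

def phi(X1, Y1):
    """Change of variables X = 4*X1 + 3, Y = Y1 + 2*X1 + 5."""
    return ((4 * X1 + 3) % p, (Y1 + 2 * X1 + 5) % p)

points_E = {(x, y) for x in range(p) for y in range(p) if on_E(x, y)}
points_E1 = {(x, y) for x in range(p) for y in range(p) if on_E1(x, y)}
image = {phi(*P) for P in points_E1}

print(sorted(points_E))        # the five affine points of E
print(image == points_E)       # prints True
```

Together with the point at infinity, each curve therefore has six rational points over $\mathbb{F}_7$.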

Curve isomorphism is an equivalence relation. The following lemma establishes the fact that, over the algebraic closure $\overline{K}$, the $j$-invariant characterizes the equivalence classes in this relation.

Lemma 4.1. Two elliptic curves that are isomorphic over $K$ have the same $j$-invariant. Conversely, two curves with the same $j$-invariant are isomorphic over the algebraic closure $\overline{K}$.

But curves with the same $j$-invariant may not necessarily be isomorphic over the ground field. For example, consider the elliptic curve, also over $\mathbb{F}_7$,

$$E_2 : Y_2^2 = X_2^3 + 4 \cdot X_2 + 4.$$

This has $j$-invariant equal to 5 so it is isomorphic to $E$ over $\overline{\mathbb{F}}_7$, but it is not isomorphic over $\mathbb{F}_7$ since the change of variable required is given by

$$X = 3 \cdot X_2 \quad \text{and} \quad Y = \sqrt{6} \cdot Y_2.$$

However, $\sqrt{6} \notin \mathbb{F}_7$. Hence, we say both $E$ and $E_2$ are defined over $\mathbb{F}_7$, but they are isomorphic over

$$\mathbb{F}_{7^2} = \mathbb{F}_7[\sqrt{6}] \subset \overline{\mathbb{F}}_7.$$


4.2.  The  Group  Law 


Assume, for the moment, that $\operatorname{char} K \neq 2, 3$, and consider the change of variables given by

$$X = X_1 - \frac{b_2}{12}, \qquad Y = Y_1 - \frac{a_1}{2} \cdot \left( X_1 - \frac{b_2}{12} \right) - \frac{a_3}{2}.$$

This change of variables transforms the long Weierstrass form given in equation (5) to the equation
of an isomorphic curve given in short Weierstrass form,

$$E : Y^2 = X^3 + a \cdot X + b,$$

for some $a, b \in K$. One can then define a group law on an elliptic curve using the chord-tangent
process.

The chord process is defined as follows; see Figure 4.1 for a diagrammatic description. Let $P$
and $Q$ be two distinct points on $E$. The straight line joining $P$ and $Q$ must intersect the curve at
one further point, say $R$, since we are intersecting a line with a cubic curve. The point $R$ will also
be defined over the same field of definition as the curve and the two points $P$ and $Q$. If we then
reflect $R$ in the x-axis we obtain another point over the same field which we shall call $P + Q$.

Figure 4.1. Adding two points on an elliptic curve

The tangent process is given diagrammatically in Figure 4.2 or as follows, for a point $P$ on the
curve $E$. We take the tangent to the curve at $P$; such a line must intersect $E$ in at most one other
point, say $R$, as the elliptic curve $E$ is defined by a cubic equation. Again we reflect $R$ in the x-axis
to obtain a point which we call $[2]P = P + P$. If the tangent to the point is vertical, it “intersects”
the curve at the point at infinity and $P + P = \mathcal{O}$, and $P$ is said to be a point of order 2.

One can show that the chord-tangent process turns $E$ into an abelian group with the point at
infinity $\mathcal{O}$ being the identity. The above definition can easily be extended to the long Weierstrass
form (and so to characteristic two and three). One simply changes the definition by replacing
“reflection in the x-axis” by “reflection in the line $Y = a_1 \cdot X + a_3$”. In addition a little calculus will
result in explicit algebraic formulae for the chord-tangent process. This is necessary since drawing
diagrams as above is not really allowed in a field of finite characteristic. The algebraic formulae
are summarized in the following lemma.

Lemma 4.2. Let $E$ denote an elliptic curve given by

$$E : Y^2 + a_1 \cdot X \cdot Y + a_3 \cdot Y = X^3 + a_2 \cdot X^2 + a_4 \cdot X + a_6$$

and let $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ denote points on the curve. Then

$$-P_1 = (x_1, -y_1 - a_1 \cdot x_1 - a_3).$$

Set

$$\lambda = \frac{y_2 - y_1}{x_2 - x_1}, \qquad \mu = \frac{y_1 \cdot x_2 - y_2 \cdot x_1}{x_2 - x_1}$$

when $x_1 \neq x_2$, and set

$$\lambda = \frac{3 \cdot x_1^2 + 2 \cdot a_2 \cdot x_1 + a_4 - a_1 \cdot y_1}{2 \cdot y_1 + a_1 \cdot x_1 + a_3}, \qquad \mu = \frac{-x_1^3 + a_4 \cdot x_1 + 2 \cdot a_6 - a_3 \cdot y_1}{2 \cdot y_1 + a_1 \cdot x_1 + a_3}$$

when $x_1 = x_2$ and $P_2 \neq -P_1$. If

$$P_3 = (x_3, y_3) = P_1 + P_2 \neq \mathcal{O}$$

then $x_3$ and $y_3$ are given by the formulae

$$x_3 = \lambda^2 + a_1 \cdot \lambda - a_2 - x_1 - x_2,$$
$$y_3 = -(\lambda + a_1) \cdot x_3 - \mu - a_3.$$

Figure 4.2. Doubling a point on an elliptic curve


The  elliptic  curve  isomorphisms  described  earlier  then  become  group  isomorphisms  as  they  respect 
the  group  structure. 


4.2.1. The Elliptic Curve Discrete Logarithm Problem (ECDLP): For a positive integer
$m$ we let $[m]$ denote the multiplication-by-$m$ map from the curve to itself. This map takes a point
$P$ to

$$P + P + \cdots + P,$$

where we have $m$ summands. This map is the basis of elliptic curve cryptography, since whilst it
is easy to compute, it is believed to be hard to invert, i.e. given $P = (x, y)$ and $[m]P = (x', y')$ it is
hard to compute $m$. Of course this statement of hardness assumes a well-chosen elliptic curve etc.,
something we will return to later.

Example: We end this section with an example of the elliptic curve group law. Again we take our
elliptic curve

$$E : Y^2 = X^3 + X + 3$$

over the field $\mathbb{F}_7$. It turns out there are six points on this curve given by

$$\mathcal{O}, \ (4, 1), \ (6, 6), \ (5, 0), \ (6, 1) \ \text{and} \ (4, 6).$$

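These form a group with the group law being given by the following table, which is computed using
the addition formulae given above.

   +   |   O     (4,1)   (6,6)   (5,0)   (6,1)   (4,6)
-------+-----------------------------------------------
   O   |   O     (4,1)   (6,6)   (5,0)   (6,1)   (4,6)
 (4,1) | (4,1)   (6,6)   (5,0)   (6,1)   (4,6)     O
 (6,6) | (6,6)   (5,0)   (6,1)   (4,6)     O     (4,1)
 (5,0) | (5,0)   (6,1)   (4,6)     O     (4,1)   (6,6)
 (6,1) | (6,1)   (4,6)     O     (4,1)   (6,6)   (5,0)
 (4,6) | (4,6)     O     (4,1)   (6,6)   (5,0)   (6,1)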


As an example of the multiplication-by-$m$ map, if we let $P = (4, 1)$ then we have

$$[2]P = (6, 6), \quad [3]P = (5, 0), \quad [4]P = (6, 1), \quad [5]P = (4, 6), \quad [6]P = \mathcal{O}.$$

So we see in this example that $E(\mathbb{F}_7)$ is a finite cyclic abelian group of order six generated by the
point $P$. For all elliptic curves over finite fields the group is always finite and it is also highly likely
to be cyclic (or “nearly” cyclic).
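The multiples listed above can be reproduced directly from the formulae of Lemma 4.2. The following is a minimal sketch over a prime field; the function names `ec_add` and `ec_mul`, the tuple `coeffs` holding $(a_1, a_2, a_3, a_4, a_6)$, and the use of `None` for the point at infinity are our own conventions. The naive repeated addition in `ec_mul` is only intended for tiny examples.

```python
def ec_add(P, Q, coeffs, p):
    """Add two points on Y^2 + a1*X*Y + a3*Y = X^3 + a2*X^2 + a4*X + a6 over F_p.

    Points are (x, y) tuples; None stands for the point at infinity O.
    """
    a1, a2, a3, a4, a6 = coeffs
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    # Q = -P exactly when x2 = x1 and y2 = -y1 - a1*x1 - a3.
    if x1 == x2 and (y1 + y2 + a1 * x1 + a3) % p == 0:
        return None
    if x1 != x2:
        inv = pow((x2 - x1) % p, -1, p)
        lam = (y2 - y1) * inv % p
        mu = (y1 * x2 - y2 * x1) * inv % p
    else:
        # Here P = Q and the tangent is not vertical.
        inv = pow((2 * y1 + a1 * x1 + a3) % p, -1, p)
        lam = (3 * x1 * x1 + 2 * a2 * x1 + a4 - a1 * y1) * inv % p
        mu = (-x1**3 + a4 * x1 + 2 * a6 - a3 * y1) * inv % p
    x3 = (lam * lam + a1 * lam - a2 - x1 - x2) % p
    y3 = (-(lam + a1) * x3 - mu - a3) % p
    return (x3, y3)

def ec_mul(m, P, coeffs, p):
    """Compute [m]P by m successive additions (fine for toy examples only)."""
    R = None
    for _ in range(m):
        R = ec_add(R, P, coeffs, p)
    return R

# E: Y^2 = X^3 + X + 3 over F_7, i.e. (a1, a2, a3, a4, a6) = (0, 0, 0, 1, 3).
coeffs = (0, 0, 0, 1, 3)
multiples = [ec_mul(m, (4, 1), coeffs, 7) for m in range(1, 7)]
```

In practice one would replace the loop in `ec_mul` by a double-and-add method, but the quantity computed is the same.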


4.3.  Elliptic  Curves  over  Finite  Fields 


Over a finite field $\mathbb{F}_q$, the number of rational points on a curve is finite, and its size will be denoted
by $\#E(\mathbb{F}_q)$. The expected number of points on the curve is around $q + 1$ and if we set

$$\#E(\mathbb{F}_q) = q + 1 - t$$

then the value $t$ is called the trace of Frobenius at $q$. A first approximation to the order of $E(\mathbb{F}_q)$
is given by the following well-known theorem of Hasse.

Theorem 4.3 (H. Hasse, 1933). The trace of Frobenius satisfies

$$|t| \leq 2 \cdot \sqrt{q}.$$

Consider our example of

$$E : Y^2 = X^3 + X + 3$$

then recall this has six points over the field $\mathbb{F}_7$, and so the associated trace of Frobenius is equal to
2, which is less than $2 \cdot \sqrt{q} = 2 \cdot \sqrt{7} \approx 5.29$.
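For very small prime fields the group order, and hence the trace of Frobenius, can be found by simply trying every pair $(x, y)$. The sketch below does this for curves in short Weierstrass form over $\mathbb{F}_p$ with $p$ an odd prime; the function names are hypothetical, and the method is exponential in the bit length of $p$, so it is not the polynomial-time point-counting algorithm alluded to later.

```python
def count_points(a, b, p):
    """#E(F_p) for E: Y^2 = X^3 + a*X + b over F_p (p an odd prime), by brute force."""
    # roots[v] = number of y in F_p with y^2 = v
    roots = {}
    for y in range(p):
        v = y * y % p
        roots[v] = roots.get(v, 0) + 1
    affine = sum(roots.get((x**3 + a * x + b) % p, 0) for x in range(p))
    return affine + 1  # add one for the point at infinity

def trace_of_frobenius(a, b, p):
    """The value t defined by #E(F_p) = p + 1 - t."""
    return p + 1 - count_points(a, b, p)

def satisfies_hasse(t, q):
    """Check |t| <= 2*sqrt(q) using integers only."""
    return t * t <= 4 * q
```

For the running example, `count_points(1, 3, 7)` gives the six points found above and `trace_of_frobenius(1, 3, 7)` gives the trace 2.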

The $q$th-power Frobenius map, on an elliptic curve $E$ defined over $\mathbb{F}_q$, is given by

$$\varphi : \begin{cases} E(\overline{\mathbb{F}}_q) \longrightarrow E(\overline{\mathbb{F}}_q) \\ (x, y) \longmapsto (x^q, y^q) \\ \mathcal{O} \longmapsto \mathcal{O}. \end{cases}$$

The map $\varphi$ sends points on $E$ to points on $E$, no matter what the field of definition of the point
is. In addition the map $\varphi$ respects the group law in that

$$\varphi(P + Q) = \varphi(P) + \varphi(Q).$$

In other words the map $\varphi$ is a group endomorphism of $E$ over $\overline{\mathbb{F}}_q$, the algebraic closure of $\mathbb{F}_q$;
it is referred to as the Frobenius endomorphism. The trace of Frobenius $t$ and the
Frobenius endomorphism $\varphi$ are linked by the equation

$$\varphi^2 - [t] \varphi + [q] = [0].$$

Hence, for any point $P = (x, y)$ on the curve, we have

$$(x^{q^2}, y^{q^2}) - [t](x^q, y^q) + [q](x, y) = \mathcal{O},$$

where addition and subtraction denote curve operations.


As was apparent from the earlier discussion, the cases $\operatorname{char} K = 2, 3$ often require separate
treatment. Practical implementations of elliptic curve cryptosystems are usually based on either
$\mathbb{F}_{2^n}$, i.e. characteristic two, or $\mathbb{F}_p$ for large primes $p$. Therefore, in the remainder of this chapter we
will focus on fields of characteristic two and $p > 3$, and will omit the separate treatment of the case
$\operatorname{char} K = 3$. Most arguments, though, carry across easily to characteristic three, with modifications
that are well documented in the literature.


4.3.1. Curves over Fields of Characteristic p > 3: Assume that our finite field is given by
$K = \mathbb{F}_q$, where $q = p^n$ for a prime $p > 3$ and an integer $n \geq 1$. As mentioned, the curve equation
in this case can be simplified to the short Weierstrass form

$$E : Y^2 = X^3 + a \cdot X + b.$$

The discriminant of the curve then reduces to $\Delta = -16 \cdot (4 \cdot a^3 + 27 \cdot b^2)$, and its $j$-invariant to
$j(E) = -1728 \cdot (4 \cdot a)^3 / \Delta$. The formulae for the group law in Lemma 4.2 also simplify to

$$-P_1 = (x_1, -y_1),$$

and if

$$P_3 = (x_3, y_3) = P_1 + P_2 \neq \mathcal{O},$$

then $x_3$ and $y_3$ are given by the formulae

$$x_3 = \lambda^2 - x_1 - x_2,$$
$$y_3 = (x_1 - x_3) \cdot \lambda - y_1,$$

where if $x_1 \neq x_2$ we set

$$\lambda = \frac{y_2 - y_1}{x_2 - x_1},$$

and if $x_1 = x_2$, $y_1 \neq 0$ we set

$$\lambda = \frac{3 \cdot x_1^2 + a}{2 \cdot y_1}.$$

4.3.2. Curves over Fields of Characteristic Two: We now specialize to the case of finite fields
where $q = 2^n$ with $n \geq 1$. In this case, the expression for the $j$-invariant reduces to $j(E) = a_1^{12} / \Delta$. In
characteristic two, the condition $j(E) = 0$, i.e. $a_1 = 0$, is equivalent to the curve being supersingular.
As mentioned earlier, this very special type of curve is avoided in cryptography. We assume,
therefore, that $j(E) \neq 0$.

Under these assumptions, a representative for each isomorphism class of elliptic curves over $\mathbb{F}_q$
is given by

$$E : Y^2 + X \cdot Y = X^3 + a_2 \cdot X^2 + a_6, \qquad (6)$$

where $a_6 \in \mathbb{F}_q^*$ and $a_2 \in \{0, \gamma\}$ with $\gamma$ a fixed element in $\mathbb{F}_q$ such that $\operatorname{Tr}_{q|2}(\gamma) = 1$, where $\operatorname{Tr}_{q|2}$ is
the absolute trace

$$\operatorname{Tr}_{2^n|2}(\alpha) = \sum_{i=0}^{n-1} \alpha^{2^i}.$$

The formulae for the group law in Lemma 4.2 then simplify to

$$-P_1 = (x_1, y_1 + x_1),$$

and if

$$P_3 = (x_3, y_3) = P_1 + P_2 \neq \mathcal{O},$$

then $x_3$ and $y_3$ are given by the formulae

$$x_3 = \lambda^2 + \lambda + a_2 + x_1 + x_2,$$
$$y_3 = (\lambda + 1) \cdot x_3 + \mu = (x_1 + x_3) \cdot \lambda + x_3 + y_1,$$

where if $x_1 \neq x_2$ we set

$$\lambda = \frac{y_2 + y_1}{x_2 + x_1}, \qquad \mu = \frac{y_1 \cdot x_2 + y_2 \cdot x_1}{x_2 + x_1},$$

and if $x_1 = x_2 \neq 0$ we set

$$\lambda = \frac{x_1^2 + y_1}{x_1}, \qquad \mu = x_1^2.$$


4.4.  Projective  Coordinates 

One  of  the  problems  with  the  above  formulae  for  the  group  laws,  given  in  both  large  and  even 
characteristic,  is  that  at  some  stage  they  involve  a  division  operation.  Division  in  finite  fields  is 
considered  to  be  an  expensive  operation,  since  it  usually  involves  some  variant  of  the  extended 
Euclidean  algorithm,  which  although  of  approximately  the  same  complexity  as  multiplication  can 
usually  not  be  implemented  as  efficiently. 

To avoid these division operations one can use projective coordinates. Here one writes the
elliptic curve using three variables $(X, Y, Z)$ instead of just $(X, Y)$. Instead of using the projective
representation given at the start of this chapter we use one where the curve is written as

$$E : Y^2 + a_1 \cdot X \cdot Y \cdot Z + a_3 \cdot Y \cdot Z^3 = X^3 + a_2 \cdot X^2 \cdot Z^2 + a_4 \cdot X \cdot Z^4 + a_6 \cdot Z^6.$$

The point at infinity is still denoted by $(0, 1, 0)$, but now the map from projective to affine
coordinates is given by

$$(X, Y, Z) \longmapsto (X / Z^2, \ Y / Z^3).$$

This  choice  of  projective  coordinates  is  made  to  provide  a  more  efficient  arithmetic  operation. 

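4.4.1. Large Prime Characteristic: The formulae for point addition when our elliptic curve is
written as

$$E : Y^2 = X^3 + a \cdot X \cdot Z^4 + b \cdot Z^6$$

are now given by the law

$$(X_3, Y_3, Z_3) = (X_1, Y_1, Z_1) + (X_2, Y_2, Z_2)$$

where $(X_3, Y_3, Z_3)$ are derived from the formulae

$$\begin{aligned}
\lambda_1 &= X_1 \cdot Z_2^2, & \lambda_2 &= X_2 \cdot Z_1^2, \\
\lambda_3 &= \lambda_1 - \lambda_2, & \lambda_4 &= Y_1 \cdot Z_2^3, \\
\lambda_5 &= Y_2 \cdot Z_1^3, & \lambda_6 &= \lambda_4 - \lambda_5, \\
\lambda_7 &= \lambda_1 + \lambda_2, & \lambda_8 &= \lambda_4 + \lambda_5, \\
Z_3 &= Z_1 \cdot Z_2 \cdot \lambda_3, & X_3 &= \lambda_6^2 - \lambda_7 \cdot \lambda_3^2, \\
\lambda_9 &= \lambda_7 \cdot \lambda_3^2 - 2 \cdot X_3, & Y_3 &= (\lambda_9 \cdot \lambda_6 - \lambda_8 \cdot \lambda_3^3) / 2.
\end{aligned}$$

Notice the avoidance of any division operation, bar division by 2 which can be easily accomplished
by multiplication by the precomputed value of $2^{-1} \pmod{p}$. Doubling a point,

$$(X_3, Y_3, Z_3) = [2](X_1, Y_1, Z_1),$$

can be accomplished using the formulae

$$\begin{aligned}
\lambda_1 &= 3 \cdot X_1^2 + a \cdot Z_1^4, & Z_3 &= 2 \cdot Y_1 \cdot Z_1, \\
\lambda_2 &= 4 \cdot X_1 \cdot Y_1^2, & X_3 &= \lambda_1^2 - 2 \cdot \lambda_2, \\
\lambda_3 &= 8 \cdot Y_1^4, & Y_3 &= \lambda_1 \cdot (\lambda_2 - X_3) - \lambda_3.
\end{aligned}$$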
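To see the projective formulae in action, the following sketch applies them to the toy curve $Y^2 = X^3 + X + 3$ over $\mathbb{F}_7$ and converts the results back to affine coordinates. The function names are our own; `jac_add` assumes two distinct finite points which are not negatives of each other, and `jac_double` assumes a point which is not of order two. A single inversion is needed only at the very end, in `to_affine`.

```python
def jac_double(P, a, p):
    """Double (X1, Y1, Z1) on Y^2 = X^3 + a*X*Z^4 + b*Z^6 over F_p."""
    X1, Y1, Z1 = P
    l1 = (3 * X1 * X1 + a * pow(Z1, 4, p)) % p
    Z3 = 2 * Y1 * Z1 % p
    l2 = 4 * X1 * Y1 * Y1 % p
    X3 = (l1 * l1 - 2 * l2) % p
    l3 = 8 * pow(Y1, 4, p) % p
    Y3 = (l1 * (l2 - X3) - l3) % p
    return (X3, Y3, Z3)

def jac_add(P, Q, p):
    """Add distinct points P != -Q, neither of them the point at infinity."""
    X1, Y1, Z1 = P
    X2, Y2, Z2 = Q
    l1 = X1 * Z2 * Z2 % p
    l2 = X2 * Z1 * Z1 % p
    l3 = (l1 - l2) % p
    l4 = Y1 * pow(Z2, 3, p) % p
    l5 = Y2 * pow(Z1, 3, p) % p
    l6 = (l4 - l5) % p
    l7 = (l1 + l2) % p
    l8 = (l4 + l5) % p
    Z3 = Z1 * Z2 * l3 % p
    X3 = (l6 * l6 - l7 * l3 * l3) % p
    l9 = (l7 * l3 * l3 - 2 * X3) % p
    half = pow(2, -1, p)  # the precomputed value 2^{-1} mod p
    Y3 = (l9 * l6 - l8 * pow(l3, 3, p)) * half % p
    return (X3, Y3, Z3)

def to_affine(P, p):
    """Map (X, Y, Z) with Z != 0 to (X/Z^2, Y/Z^3)."""
    X, Y, Z = P
    zi = pow(Z, -1, p)
    return (X * zi * zi % p, Y * pow(zi, 3, p) % p)

P = (4, 1, 1)                  # the affine point (4, 1) with Z = 1
P2 = jac_double(P, 1, 7)       # [2]P in projective form
P3 = jac_add(P2, P, 7)         # [3]P
P4 = jac_add(P3, P, 7)         # [4]P
```

Converting `P2`, `P3` and `P4` with `to_affine` should give the multiples of $(4, 1)$ computed earlier with the affine formulae.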


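4.4.2. Even Characteristic: In even characteristic we write our elliptic curve in the form

$$E : Y^2 + X \cdot Y \cdot Z = X^3 + a_2 \cdot X^2 \cdot Z^2 + a_6 \cdot Z^6.$$

Point addition,

$$(X_3, Y_3, Z_3) = (X_1, Y_1, Z_1) + (X_2, Y_2, Z_2)$$

is now accomplished using the recipe

$$\begin{aligned}
\lambda_1 &= X_1 \cdot Z_2^2, & \lambda_2 &= X_2 \cdot Z_1^2, \\
\lambda_3 &= \lambda_1 + \lambda_2, & \lambda_4 &= Y_1 \cdot Z_2^3, \\
\lambda_5 &= Y_2 \cdot Z_1^3, & \lambda_6 &= \lambda_4 + \lambda_5, \\
\lambda_7 &= Z_1 \cdot \lambda_3, & \lambda_8 &= \lambda_6 \cdot X_2 + \lambda_7 \cdot Y_2, \\
Z_3 &= \lambda_7 \cdot Z_2, & \lambda_9 &= \lambda_6 + Z_3, \\
X_3 &= a_2 \cdot Z_3^2 + \lambda_6 \cdot \lambda_9 + \lambda_3^3, & Y_3 &= \lambda_9 \cdot X_3 + \lambda_8 \cdot \lambda_7^2.
\end{aligned}$$

Doubling is performed using

$$\begin{aligned}
Z_3 &= X_1 \cdot Z_1^2, & X_3 &= (X_1 + d_6 \cdot Z_1^2)^4, \\
\lambda &= Z_3 + X_1^2 + Y_1 \cdot Z_1, & Y_3 &= X_1^4 \cdot Z_3 + \lambda \cdot X_3,
\end{aligned}$$

where $d_6 = \sqrt[4]{a_6}$. Notice how in both even and odd characteristic we have avoided a division
operation when performing curve operations.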


4.5.  Point  Compression 

In many cryptographic protocols we need to store or transmit an elliptic curve point. Using affine
coordinates this can be accomplished using two field elements, i.e. by transmitting $x$ and then $y$.
However, one can do better using a technique called point compression. Point compression is based
on the observation that for every x-coordinate on the curve there are at most two corresponding
y-coordinates. Hence, we can represent a point by storing the x-coordinate along with a bit $b$ to
say which value of the y-coordinate we should take. All that remains to decide is how to compute
the bit $b$ and how to reconstruct the y-coordinate given the x-coordinate and the bit $b$.

Large Prime Characteristic: For elliptic curves over fields of large prime characteristic we notice
that if $\alpha \in \mathbb{F}_p^*$ is a square, then the two square roots $\pm\beta$ of $\alpha$ have different parities when represented
as integers in the range $[1, \ldots, p - 1]$. This is because $-\beta = p - \beta$. Hence, as the bit $b$ we choose
the parity of the y-coordinate. Given $(x, b)$, we can reconstruct $y$ by computing

$$\beta = \sqrt{x^3 + a \cdot x + b} \pmod{p},$$

where the $b$ under the square root is the curve coefficient and not the compression bit.
If the parity of $\beta$ is equal to the bit $b$ we set $y = \beta$, otherwise we set $y = p - \beta$. If $\beta = 0$ then no matter
which value of the bit $b$ we have we set $y = 0$.

As an example consider the curve

$$E : Y^2 = X^3 + X + 3$$

over the field $\mathbb{F}_7$. Then the points $(4, 1)$ and $(4, 6)$ which in bits we need to represent as

(0b100, 0b001) and (0b100, 0b110),

i.e. requiring six bits for each point, can be represented as

(0b100, 0b1) and (0b100, 0b0),

where we only use four bits for each point. In larger, cryptographically interesting, examples the
advantage becomes more pronounced. For example consider the curve with the same coefficients
but over the finite field $\mathbb{F}_p$ where

$$p = 1\,125\,899\,906\,842\,679 = 2^{50} + 55$$

then the point

(1 125 899 906 842 675, 245 132 605 757 739)

can be represented by the integers

(1 125 899 906 842 675, 1).

So instead of requiring 102 bits we only require 52 bits.
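The compression and decompression steps for large prime characteristic can be sketched as follows. The function names are our own, and to keep the square root simple the sketch assumes $p \equiv 3 \pmod 4$, in which case a square root of a square $\alpha$ is $\alpha^{(p+1)/4} \bmod p$; both $p = 7$ and $p = 2^{50} + 55$ satisfy this congruence. For other primes a general modular square root algorithm would be needed.

```python
def compress(P):
    """Map an affine point (x, y) to (x, parity bit of y)."""
    x, y = P
    return (x, y % 2)

def decompress(x, bit, a, b, p):
    """Recover (x, y) on Y^2 = X^3 + a*X + b over F_p from x and the parity bit.

    Assumes p is a prime with p = 3 (mod 4); `b` is the curve coefficient.
    """
    alpha = (x**3 + a * x + b) % p
    beta = pow(alpha, (p + 1) // 4, p)  # candidate square root of alpha
    if beta * beta % p != alpha:
        raise ValueError("x is not the x-coordinate of a point on the curve")
    if beta == 0:
        return (x, 0)  # y = 0 whatever the bit is
    if beta % 2 == bit:
        return (x, beta)
    return (x, p - beta)
```

On the toy curve over $\mathbb{F}_7$, `compress((4, 1))` and `compress((4, 6))` give the two representations shown in the example, and `decompress` inverts them.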


Even Characteristic: In even characteristic we need to be slightly more clever. Suppose we are
given a point $P = (x, y)$ on the elliptic curve

$$E : Y^2 + X \cdot Y = X^3 + a_2 \cdot X^2 + a_6.$$

If $y = 0$ then we set $b = 0$, otherwise we compute

$$z = y / x$$

and let $b$ denote the least significant bit of $z$. To recover $y$ given $(x, b)$, for $x \neq 0$, we set

$$\alpha = x + a_2 + \frac{a_6}{x^2}$$

and let $\beta$ denote a solution of

$$z^2 + z = \alpha.$$

Then if the least significant bit of $\beta$ is equal to $b$ we set $y = x \cdot \beta$, otherwise we set $y = x \cdot (\beta + 1)$.
To see why this works notice that if $(x, y)$ is a solution of

$$E : Y^2 + X \cdot Y = X^3 + a_2 \cdot X^2 + a_6$$

then $(x, y/x)$ and $(x, 1 + y/x)$ are the two solutions of

$$Z^2 + Z = X + a_2 + \frac{a_6}{X^2}.$$

4.6. Choosing an Elliptic Curve

One of the advantages of elliptic curves is that there is a very large number of possible groups. One
can choose both the finite field and the coefficients of the curve. In addition finding elliptic curves
with the correct cryptographic properties to make the systems using them secure is relatively easy;
we just have to know which curves to avoid.

For any elliptic curve and any finite field the group order $\#E(\mathbb{F}_q)$ can be computed in polynomial
time. But this is usually done via a complicated algorithm that we cannot go into in this book.
Hence, you should just remember that computing the group order is computationally easy. As
we saw in Chapter 3, when considering algorithms to solve discrete logarithm problems, knowing
the group order is important in understanding how secure a group is. For some elliptic curves
computing the group order is easy; in particular supersingular curves. The curve $E(\mathbb{F}_q)$ is said to
be supersingular if the characteristic $p$ divides the trace of Frobenius, $t$. If $q = p$ then this means
that $E(\mathbb{F}_p)$ has $p + 1$ points since we must have $t = 0$. For other finite fields the possible values of
$t$ corresponding to supersingular elliptic curves are given by, where $q = p^f$,

• $f$ odd: $t = 0$, $t^2 = 2q$ and $t^2 = 3q$.

• $f$ even: $t^2 = 4q$, $t^2 = q$ if $p \not\equiv 1 \pmod 3$ and $t = 0$ if $p \not\equiv 1 \pmod 4$.

For  elliptic  curves  there  are  no  known  sub-exponential  methods  for  the  discrete  logarithm 
problem,  except  in  certain  special  cases.  There  are  three  particular  classes  of  curves  which,  under 
certain  conditions,  will  prove  to  be  cryptographically  weak: 

• The curve $E(\mathbb{F}_q)$ is said to be anomalous if its trace of Frobenius is one, giving $\#E(\mathbb{F}_q) = q$.
These curves are weak when $q = p$, the field characteristic. In this case there is an algorithm
to solve the discrete logarithm problem which requires $O(\log p)$ elliptic curve operations.

• For any $q$ we must choose curves for which there is no small number $t$ such that $r$ divides
$q^t - 1$, where $r$ is the large prime factor of $\#E(\mathbb{F}_q)$ (here $t$ is simply an integer, not the
trace of Frobenius). This alas eliminates the supersingular curves and a few others. When
such a small $t$ exists there is a simple computable mapping from the
elliptic curve discrete logarithm problem to the discrete logarithm problem in the finite
field $\mathbb{F}_{q^t}$. Hence, in this case we obtain a sub-exponential method for solving the elliptic
curve discrete logarithm problem.

• If $q = 2^n$ then we usually assume that $n$ is prime to avoid the possibility of certain attacks
based on the concept of “Weil descent”.

One should treat these three special cases much like one treats the generation of large integers for
the RSA algorithm. Due to the $P - 1$ factoring method one often makes RSA moduli $N = p \cdot q$
such that $p$ is a so-called safe prime of the form $2 \cdot p_1 + 1$. Another special RSA-based case is that we
almost always use RSA with a modulus having two prime factors, rather than three or four. This
is because moduli with two prime factors appear to be the hardest to factor.

It turns out that the only known practical method to solve the discrete logarithm problem in
general elliptic curves is the parallel version of Pollard’s Rho method given in Chapter 3. Thus we
need to choose a curve such that the group order $\#E(\mathbb{F}_q)$ is divisible by a large prime number $r$,
and for which the curve is not considered weak by the above considerations. Hence, from now on
we suppose the elliptic curve $E$ is defined over the finite field $\mathbb{F}_q$ and we have

$$\#E(\mathbb{F}_q) = h \cdot r$$

where $r$ is a “large” prime number and $h$ is a small number called the cofactor. By Hasse’s Theorem
4.3 the value of $\#E(\mathbb{F}_q)$ is close to $q$ so we typically choose a curve with $r$ close to $q$, i.e. we choose
a curve $E$ so that $h = 1$, 2 or 4.

Since the best general algorithm known for the elliptic curve discrete logarithm problem is the
parallel Pollard’s Rho method, which has complexity $O(\sqrt{r})$, which is about $O(\sqrt{q})$, to achieve the
same security as a 128-bit block cipher we need to take $q \approx 2^{256}$, which is a lot smaller than the
field size recommended for systems based on the discrete logarithm problems in a finite field. This
results in the reduced bandwidth and computational times of elliptic curve systems.
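Two of the weakness conditions discussed in this section are simple arithmetic tests once the group order is known. The following sketch shows them; the function names and the parameter `bound`, which fixes what we are willing to call a “small” exponent, are our own and not standard values. The prime $r$ is assumed not to divide $q$.

```python
def is_anomalous(q, order):
    """True if #E(F_q) = q, i.e. the trace of Frobenius is one."""
    return order == q

def embedding_degree(q, r, bound):
    """Smallest t <= bound such that r divides q^t - 1, or None if there is none.

    A small returned value means the discrete logarithm problem can be
    transferred to the finite field F_{q^t}.
    """
    acc = 1
    for t in range(1, bound + 1):
        acc = acc * q % r  # acc = q^t mod r
        if acc == 1:
            return t
    return None
```

For instance, a supersingular curve over $\mathbb{F}_p$ has $p + 1$ points, so any prime $r$ dividing the group order also divides $p^2 - 1$; with $p = 11$ and $r = 3$ the call `embedding_degree(11, 3, 10)` returns 2.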


Chapter  Summary 


• Elliptic curves over finite fields are another example of a finite abelian group. There are
many such groups since we are free to choose both the curve and the field.

• For cryptography we need to be able to compute the number of elements in the group.
Although this is done using a complicated algorithm, it can be done in polynomial time.

• One should usually avoid supersingular and anomalous curves in cryptographic applications.

• Efficient algorithms for the group law can be produced by using projective coordinates.
These algorithms avoid the need for costly division operations in the underlying finite field.

• To save bandwidth and space it is possible to efficiently compress elliptic curve points
$(x, y)$ down to $x$ and a single bit $b$. The uncompression can also be performed efficiently.

• For elliptic curves there are no sub-exponential algorithms known for the discrete logarithm
problem, except in very special cases. Hence, the only practical general algorithm to solve
the discrete logarithm problem on an elliptic curve is the parallel Pollard’s Rho method.


Further  Reading 

Those who wish to learn more about elliptic curves in general may try the textbook by Silverman
(which is really aimed at mathematics graduate students). Those who are simply interested
in the cryptographic applications of elliptic curves and the associated algorithms and techniques
may see the book by Blake, Seroussi and Smart and its follow-up book.

I.F. Blake, G. Seroussi and N.P. Smart. Elliptic Curves in Cryptography. Cambridge University
Press, 1999.

I.F. Blake, G. Seroussi and N.P. Smart. Advances in Elliptic Curve Cryptography. Cambridge
University Press, 2004.

J.H. Silverman. The Arithmetic of Elliptic Curves. Springer, 1985.


CHAPTER  5 


Lattices 


Chapter  Goals 


•  To  describe  lattice  basis  reduction  algorithms  and  give  some  examples  of  how  they  are 
used  to  break  cryptographic  systems. 

•  To  introduce  the  hard  problems  of  SVP,  CVP,  BDD  and  their  various  variations. 

• To introduce $q$-ary lattices.

•  To  explain  the  technique  of  Coppersmith  for  finding  small  roots  of  modular  polynomial 
equations. 


5.1.  Lattices  and  Lattice  Reduction 

In  this  chapter  we  present  the  concept  of  a  lattice.  Traditionally  in  cryptography  lattices  have  been 
used  in  cryptanalysis  to  break  systems,  and  we  shall  see  applications  of  this  in  Chapter  15.  However, 
recently  they  have  also  been  used  to  construct  cryptographic  systems  with  special  properties,  as 
we  shall  see  in  Chapter  17. 


5.1.1. Vector Spaces: Before presenting lattices, and the technique of lattice basis reduction, we
first need to recap some basic linear algebra. Suppose $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ is an n-dimensional real
vector, i.e. for all $i$ we have $x_i \in \mathbb{R}$. The set of all such vectors is denoted $\mathbb{R}^n$. On two such vectors
we can define an inner product

$$\langle \mathbf{x}, \mathbf{y} \rangle = x_1 \cdot y_1 + x_2 \cdot y_2 + \cdots + x_n \cdot y_n,$$

which is a function from pairs of n-dimensional vectors to the real numbers. You probably learnt
at school that two vectors $\mathbf{x}$ and $\mathbf{y}$ are orthogonal, or meet at right angles, if and only if we have

$$\langle \mathbf{x}, \mathbf{y} \rangle = 0.$$

Given the inner product we can then define the size, or length, of a vector by

$$\|\mathbf{x}\| = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}.$$

This length corresponds to the intuitive notion of length of vectors; in particular the length satisfies
a number of properties.

• $\|\mathbf{x}\| \geq 0$, with equality if and only if $\mathbf{x}$ is the zero vector.

• Triangle inequality: For two n-dimensional vectors $\mathbf{x}$ and $\mathbf{y}$

$$\|\mathbf{x} + \mathbf{y}\| \leq \|\mathbf{x}\| + \|\mathbf{y}\|.$$

• Scaling: For a vector $\mathbf{x}$ and a real number $a$

$$\|a \cdot \mathbf{x}\| = |a| \cdot \|\mathbf{x}\|.$$

A set of vectors $\{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$ in $\mathbb{R}^n$ is called linearly independent if the equation

$$a_1 \cdot \mathbf{b}_1 + \cdots + a_m \cdot \mathbf{b}_m = \mathbf{0},$$

for real numbers $a_i$, implies that all $a_i$ are equal to zero. If the set is linearly independent then we
must have $m \leq n$. Suppose we have a set of $m$ linearly independent vectors, $\{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$. We can
look at the set of all real linear combinations of these vectors,

$$V = \left\{ \sum_{i=1}^{m} a_i \cdot \mathbf{b}_i : a_i \in \mathbb{R} \right\}.$$

This is a vector subspace of $\mathbb{R}^n$ of dimension $m$ and the set $\{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$ is called a basis of this
subspace. If we form the matrix $B$ with $i$th column of $B$ being equal to $\mathbf{b}_i$ for all $i$ then we have

$$V = \{ B \cdot \mathbf{a} : \mathbf{a} \in \mathbb{R}^m \}.$$

The matrix $B$ is called the basis matrix.


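5.1.2. The Gram-Schmidt Process: Every subspace $V$ has a large number of possible basis
matrices. Given one such basis it is often required to produce a basis with certain prescribed nice
properties. Often in applications throughout science and engineering one requires a basis which is
pairwise orthogonal, i.e.

$$\langle \mathbf{b}_i, \mathbf{b}_j \rangle = 0$$

for all $i \neq j$. Luckily there is a well-known method which takes one basis, $\{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$, and
produces a basis $\{\mathbf{b}_1^*, \ldots, \mathbf{b}_m^*\}$ which is pairwise orthogonal. This method is called the
Gram-Schmidt process and the basis $\{\mathbf{b}_1^*, \ldots, \mathbf{b}_m^*\}$ produced from $\{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$ via this process is called
the Gram-Schmidt basis corresponding to $\{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$. One computes the $\mathbf{b}_i^*$ from the $\mathbf{b}_i$ via the
recursive equations

$$\mu_{i,j} \leftarrow \frac{\langle \mathbf{b}_i, \mathbf{b}_j^* \rangle}{\langle \mathbf{b}_j^*, \mathbf{b}_j^* \rangle} \quad \text{for } 1 \leq j < i \leq n,$$

$$\mathbf{b}_i^* \leftarrow \mathbf{b}_i - \sum_{j=1}^{i-1} \mu_{i,j} \cdot \mathbf{b}_j^*.$$

For example if we have

$$\mathbf{b}_1 = \begin{pmatrix} 2 \\ 0 \end{pmatrix} \quad \text{and} \quad \mathbf{b}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

then we compute

$$\mathbf{b}_1^* \leftarrow \mathbf{b}_1 = \begin{pmatrix} 2 \\ 0 \end{pmatrix},$$

$$\mathbf{b}_2^* \leftarrow \mathbf{b}_2 - \mu_{2,1} \cdot \mathbf{b}_1^* = \begin{pmatrix} 1 \\ 1 \end{pmatrix} - \frac{1}{2} \cdot \begin{pmatrix} 2 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

since

$$\mu_{2,1} = \frac{\langle \mathbf{b}_2, \mathbf{b}_1^* \rangle}{\langle \mathbf{b}_1^*, \mathbf{b}_1^* \rangle} = \frac{2}{4} = \frac{1}{2}.$$

Notice how we have $\langle \mathbf{b}_1^*, \mathbf{b}_2^* \rangle = 0$, so the new Gram-Schmidt basis is orthogonal.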
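The recursive equations translate almost line by line into code. The sketch below uses exact rational arithmetic so that the coefficients $\mu_{i,j}$ are not disturbed by rounding; the function names are our own, indices start at zero rather than one, and the example basis $\mathbf{b}_1 = (2, 0)$, $\mathbf{b}_2 = (1, 1)$ is assumed to be the one intended in the example above (it is the basis used for the lattice example in the next subsection, and it gives $\mu_{2,1} = 2/4$).

```python
from fractions import Fraction

def inner(x, y):
    """Standard inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(basis):
    """Return (b_star, mu) for a list of linearly independent vectors.

    b_star[i] is the i-th Gram-Schmidt vector and mu[(i, j)] the coefficient
    mu_{i,j} for j < i (zero-based indices).
    """
    b_star, mu = [], {}
    for i, b in enumerate(basis):
        v = [Fraction(c) for c in b]
        for j in range(i):
            mu[(i, j)] = Fraction(inner(b, b_star[j])) / inner(b_star[j], b_star[j])
            v = [vk - mu[(i, j)] * sk for vk, sk in zip(v, b_star[j])]
        b_star.append(v)
    return b_star, mu

b_star, mu = gram_schmidt([(2, 0), (1, 1)])
```

Here `mu[(1, 0)]` plays the role of $\mu_{2,1}$ and `b_star[1]` that of $\mathbf{b}_2^*$.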


5.1.3. Lattices: A lattice $L$ is like the vector subspace $V$ above, but given a set of basis vectors
$\{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$ instead of taking the real linear combinations of the $\mathbf{b}_i$ we are only allowed to take
the integer linear combinations of the $\mathbf{b}_i$,

$$L = \left\{ \sum_{i=1}^{m} a_i \cdot \mathbf{b}_i : a_i \in \mathbb{Z} \right\} = \{ B \cdot \mathbf{a} : \mathbf{a} \in \mathbb{Z}^m \}.$$

The set $\{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$ is still called the set of basis vectors and the matrix $B$ is still called a basis
matrix. To see why lattices are called lattices, consider the lattice $L$ generated by the two vectors

$$\mathbf{b}_1 = \begin{pmatrix} 2 \\ 0 \end{pmatrix} \quad \text{and} \quad \mathbf{b}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$

This is the set of all vectors of the form

$$\begin{pmatrix} 2 \cdot x + y \\ y \end{pmatrix},$$

where $x, y \in \mathbb{Z}$. If one plots these points in the plane then one sees that these points form a
two-dimensional lattice. See for example Figure 5.1.


Figure  5.1.  A  lattice  with  two  bases  marked.  A  “nice”  one  in  red,  and  a  “bad” 
one  in  blue 


A lattice is a discrete version of a vector subspace. Since it is discrete there is a well-defined
smallest element, bar the trivially small element of the zero vector of course. This allows us to
define the non-zero minimum of any lattice $L$, which we denote by

$$\lambda_1(L) = \min \{ \|\mathbf{x}\| : \mathbf{x} \in L, \ \mathbf{x} \neq \mathbf{0} \}.$$

We can also define the successive minima $\lambda_i(L)$, which are defined as the smallest radius $r$ such that
the n-dimensional ball of radius $r$ centred on the origin contains $i$ linearly independent lattice points.
Many tasks in computing, and especially cryptography, can be reduced to trying to determine the
smallest non-zero vector in a lattice. We shall see some of these applications later, but before
continuing with our discussion of lattices in general we pause to note that it is generally considered
to be a hard problem to determine the smallest non-zero vector in an arbitrary lattice. Later we
shall see that, whilst this problem is hard in general, it is in fact easy in low dimension, a situation
which we shall use to our advantage later on.

Just as with vector subspaces, given a lattice basis one could ask, is there a nicer basis? For
example in Figure 5.1 the red basis is certainly “more orthogonal” than the blue basis. Suppose
$B$ is a basis matrix for a lattice $L$. To obtain another basis matrix $B'$ the only operation we are
allowed to use is post-multiplication of $B$ by a uni-modular integer matrix. This means we must
have

$$B' = B \cdot U$$

for some integer matrix $U$ with $\det(U) = \pm 1$. This means that the absolute value of the determinant
of a basis matrix of a lattice is an invariant of the lattice, i.e. it does not depend on the choice of
basis. Given a basis matrix $B$ for a lattice $L$, we call

$$\Delta(L) = |\det(B^T \cdot B)|^{1/2}$$

the discriminant of the lattice. If $L$ is a lattice of full rank, i.e. $B$ is a square matrix, then we have

$$\Delta(L) = |\det(B)|.$$

The value $\Delta(L)$ represents the volume of the fundamental parallelepiped of the lattice bases; see for
example Figure 5.2, which presents a lattice with two parallelepipeds marked. Each parallelepiped
has the same volume, as both correspond to a basis of the lattice. If the lattice is obvious from the
context we just write $\Delta$. From now on we shall only consider full-rank lattices, and hence $n = m$,
and our basis matrices will be square.
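The invariance of the determinant under a change of basis is easy to observe numerically. The following sketch takes the basis matrix with columns $(2, 0)$ and $(1, 1)$, post-multiplies it by a uni-modular matrix $U$ chosen by us purely for illustration, and compares determinants; the helper names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Basis matrix with columns b1 = (2, 0) and b2 = (1, 1).
B = [[2, 1],
     [0, 1]]
# An arbitrary uni-modular integer matrix: det(U) = 1*4 - 3*1 = 1.
U = [[1, 3],
     [1, 4]]
B_new = mat_mul(B, U)  # columns are b1 + b2 and 3*b1 + 4*b2
```

The new columns $(3, 1)$ and $(10, 4)$ are much longer and far less orthogonal than the original basis, yet they generate the same lattice and give the same value of $\Delta$.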


Figure  5.2.  A  lattice  with  two  fundamental  parallelepipeds  marked 

Hermite showed that there is an absolute constant $\gamma_n$, depending only on n, such that

$$\lambda_1(L) \le \sqrt{\gamma_n} \cdot \Delta(L)^{1/n}.$$

Although the value of $\gamma_n$ is only known for $1 \le n \le 8$, for “random lattices” the first minimum,
and hence Hermite’s constant $\gamma_n$, can be approximated by appealing to the Gaussian Heuristic, which
states that for a “random lattice” we have

$$\lambda_1(L) \approx \sqrt{\frac{n}{2 \cdot \pi \cdot e}} \cdot \Delta(L)^{1/n}.$$

The  classic  result  in  lattice  theory  (a.k.a.  geometry  of  numbers),  is  that  of  Minkowski,  which  relates 
the  minimal  distance  to  the  volume  of  the  fundamental  parallelepiped. 

Theorem 5.1 (Minkowski). Let L be a rank-n lattice and $C \subset \mathbb{R}^n$ be a convex symmetric body about
the origin with volume $\mathrm{Vol}(C) > 2^n \cdot \Delta(L)$. Then C contains a non-zero vector $x \in L$.

The following immediate corollary is also often referred to as “Minkowski’s Theorem”.

Corollary 5.2 (Minkowski’s Theorem). For any n-dimensional lattice L we have

$$\lambda_1(L) \le \sqrt{n} \cdot \Delta(L)^{1/n}.$$

The dual $L^*$ of a lattice L is the set of all vectors $y \in \mathbb{R}^n$ such that $y \cdot x^T \in \mathbb{Z}$ for all $x \in L$.
Given a basis matrix B of L we can compute the basis matrix $B^*$ of $L^*$ via $B^* = (B^{-1})^T$. Hence
we have $\Delta(L^*) = 1/\Delta(L)$. The first minimum of L and the nth minimum of $L^*$ are linked by the
transference theorem of Banaszczyk which states that for all n-dimensional lattices we have

$$1 \le \lambda_1(L) \cdot \lambda_n(L^*) \le n.$$

Thus a lower bound on $\lambda_1(L)$ can be translated into an upper bound on $\lambda_n(L^*)$ and vice versa, a
fact which is used often in the analysis of lattice algorithms.


5.1.4.  LLL  Algorithm:  One  could  ask,  given  a  lattice  L  does  there  exist  an  orthogonal  basis? 
In  general  the  answer  to  this  last  question  is  no.  If  one  looks  at  the  Gram-Schmidt  process  in  more 
detail  one  sees  that,  even  if  one  starts  out  with  integer  vectors,  the  coefficients  almost  always 
end  up  not  being  integers.  Hence,  whilst  the  Gram-Schmidt  basis  vectors  span  the  same  vector 
subspace  as  the  original  basis  they  do  not  span  the  same  lattice  as  the  original  basis.  This  is  because 
we  are  not  allowed  to  make  a  change  of  basis  which  consists  of  non-integer  coefficients.  However, 
we could try to make a change of basis so that the new basis is “close” to being orthogonal in that

$$|\mu_{i,j}| \le \frac{1}{2} \quad \text{for } 1 \le j < i \le n.$$


These  considerations  led  Lenstra,  Lenstra  and  Lovasz  to  define  the  following  notion  of  reduced 
basis,  called  an  LLL  reduced  basis  after  its  inventors. 


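Definition 5.3. A basis $\{b_1, \ldots, b_n\}$ is called LLL reduced if the associated Gram-Schmidt basis
$\{b_1^*, \ldots, b_n^*\}$ satisfies

$$|\mu_{i,j}| \le \frac{1}{2} \quad \text{for } 1 \le j < i \le n, \qquad (7)$$

$$\|b_i^*\|^2 \ge \left( \frac{3}{4} - \mu_{i,i-1}^2 \right) \cdot \|b_{i-1}^*\|^2 \quad \text{for } 1 < i \le n. \qquad (8)$$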


What is truly amazing about an LLL reduced basis is

• An LLL reduced basis can be computed in polynomial time; see below for the method.

• The first vector in the reduced basis is very short, in fact it is close to the shortest non-zero
vector in that for all non-zero $x \in L$ we have

$$\|b_1\| \le 2^{(n-1)/2} \cdot \|x\|.$$

• $\|b_1\| \le 2^{n/4} \cdot \Delta^{1/n}$.

The constant $2^{(n-1)/2}$ in the second bullet point above is a worst-case constant, and is called the
approximation factor. In practice for many lattices of small dimension after one applies the LLL
algorithm to obtain an LLL reduced basis, the first vector in the LLL reduced basis is in fact equal
to a shortest non-zero vector in the lattice. Hence, in such cases the approximation factor is one.

The LLL algorithm works as follows: We keep track of a copy of the current lattice basis B and
the associated Gram-Schmidt basis $B^*$. At any point in time we are examining a fixed column k,
where we start with k = 2.

• If condition (7) does not hold for $\mu_{k,j}$ with $1 \le j < k$ then we alter the basis B so that it
does, by subtracting suitable integer multiples of the earlier columns $b_j$ from $b_k$.

• If condition (8) does not hold for column k and column $k - 1$ we swap columns k and $k - 1$
around and decrease the value of k by one (unless k is already equal to two). If condition
(8) holds then we increase k by one.

At some point (in fact in polynomial time) both conditions will hold for every column up to k = n
and the algorithm will terminate.
To prove this, one shows that the number of iterations where one decreases k is bounded, and so
it is guaranteed that the algorithm will terminate. We will not prove this, but note that clearly if
the algorithm terminates it will produce an LLL reduced basis.

For a Gram-Schmidt basis $B^*$ of a basis B we define the Gram-Schmidt Log, GSL(B), as the
vector

$$\mathrm{GSL}(B) = \left( \log\left( \|b_i^*\| / \Delta(L)^{1/n} \right) \right)_{i=1,\ldots,n}.$$

It is “folklore” that the output of the LLL algorithm produces a basis B whose GSL(B) when
plotted looks like a straight line. The (average) slope $\eta_B$ of this line can then be computed from
GSL(B) via

$$\eta_B := \frac{12}{(n+1) \cdot n \cdot (n-1)} \cdot \left( \sum_{i=1}^{n} i \cdot \mathrm{GSL}(B)_i \right).$$

The Geometric Series Assumption (GSA) is that the output of the LLL algorithm does indeed
behave in this way for a given input basis.
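
The slope formula can be tried out on a basis whose Gram-Schmidt norms form an exact geometric series. The sketch below is a floating-point illustration only; the function names, the use of the natural logarithm, and the diagonal test basis are assumptions made for this example.

```python
import math

def gram_schmidt_norms(basis):
    """Euclidean norms of the Gram-Schmidt vectors b_1^*, ..., b_n^*."""
    ortho = []
    for b in basis:
        v = [float(c) for c in b]
        for u in ortho:
            mu = sum(x * y for x, y in zip(b, u)) / sum(y * y for y in u)
            v = [vi - mu * ui for vi, ui in zip(v, u)]
        ortho.append(v)
    return [math.sqrt(sum(c * c for c in u)) for u in ortho]

def gsl(basis):
    """Gram-Schmidt Log vector: log(||b_i^*|| / Delta^(1/n)) for each i."""
    norms = gram_schmidt_norms(basis)
    n = len(norms)
    # For a full-rank lattice Delta(L) is the product of the Gram-Schmidt norms.
    log_delta = sum(math.log(x) for x in norms)
    return [math.log(x) - log_delta / n for x in norms]

def gsl_slope(basis):
    """Average slope 12 / ((n+1) n (n-1)) * sum_i i * GSL_i."""
    g = gsl(basis)
    n = len(g)
    weighted = sum(i * gi for i, gi in enumerate(g, start=1))
    return 12.0 / ((n + 1) * n * (n - 1)) * weighted

# Hypothetical basis (one vector per list) with Gram-Schmidt norms 4, 2, 1.
basis = [[4, 0, 0], [0, 2, 0], [0, 0, 1]]
slope = gsl_slope(basis)
print(gsl(basis), slope)
```

For this basis the points (i, GSL_i) lie exactly on a line, and the computed slope equals log(1/2), the logarithm of the ratio between consecutive norms.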


Example: As an example we take the above basis of a two-dimensional lattice in $\mathbb{R}^2$


The  associated  Gram-Schmidt  basis  is  given  by 


But this is not a basis of the associated lattice since one cannot pass from $\{b_1, b_2\}$ to $\{b_1^*, b_2^*\}$ via
a unimodular integer transformation.

We now apply the LLL algorithm with k = 2 and find that the first condition (7) is satisfied
since $|\mu_{2,1}| \le 1/2$. However, the second condition (8) is not satisfied because

$$1 = \|b_2^*\|^2 < \left( \frac{3}{4} - \mu_{2,1}^2 \right) \cdot \|b_1^*\|^2 = 2.$$


Hence,  we  need  to  swap  the  two  basis  vectors  around,  to  obtain  the  new  lattice  basis  vectors 


with  the  associated  Gram-Schmidt  basis  vectors 


We now go back to the first condition again. This time we see that we have $|\mu_{2,1}| = 1$ which violates
the first condition. To correct this we subtract $b_1$ from the vector $b_2$ so as to obtain $\mu_{2,1} = 0$. We
now find the lattice basis is given by


and  the  Gram-Schmidt  basis  is  then  identical  to  this  lattice  basis,  in  that 

Now we are left to check the second condition again; we find


Hence,  both  conditions  are  satisfied  and  we  conclude  that 


is  an  LLL  reduced  basis  for  the  lattice  L. 
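
The algorithm and the example can be followed step by step with a short program. The sketch below is a deliberately simple, unoptimised Python version using exact rational arithmetic; it recomputes the Gram-Schmidt basis from scratch at every step. Basis vectors are given as lists rather than as matrix columns, and all function names are illustrative. The input basis (2, 0), (1, 1) is an assumption chosen because it reproduces the steps described above (a swap, then $|\mu_{2,1}| = 1$, then one subtraction giving $\mu_{2,1} = 0$).

```python
from fractions import Fraction

def dot(u, v):
    """Exact inner product of two vectors."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def gram_schmidt(basis):
    """Return the Gram-Schmidt vectors and the coefficients mu[i][j]."""
    n = len(basis)
    ortho = []
    mu = [[Fraction(0)] * n for _ in range(n)]
    for i, b in enumerate(basis):
        v = [Fraction(c) for c in b]
        for j in range(i):
            mu[i][j] = dot(b, ortho[j]) / dot(ortho[j], ortho[j])
            v = [vi - mu[i][j] * oj for vi, oj in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu

def lll(basis):
    """LLL reduction with the constant 3/4 of condition (8)."""
    basis = [list(b) for b in basis]
    n = len(basis)
    k = 1                                  # 0-indexed; k = 2 in the text
    while k < n:
        # Condition (7): make |mu[k][j]| <= 1/2 for all j < k.
        for j in range(k - 1, -1, -1):
            _, mu = gram_schmidt(basis)
            r = round(mu[k][j])            # nearest integer
            if r != 0:
                basis[k] = [a - r * b for a, b in zip(basis[k], basis[j])]
        ortho, mu = gram_schmidt(basis)
        # Condition (8): compare ||b_k^*||^2 with ||b_{k-1}^*||^2.
        lhs = dot(ortho[k], ortho[k])
        rhs = (Fraction(3, 4) - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        if lhs >= rhs:
            k += 1
        else:
            basis[k], basis[k - 1] = basis[k - 1], basis[k]
            k = max(k - 1, 1)
    return basis

reduced = lll([[2, 0], [1, 1]])            # assumed example basis
print(reduced)
```

A production implementation would update the Gram-Schmidt data incrementally instead of recomputing it; this version favours readability over speed.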


5.1.5. Continued Fractions: We end this section by noticing that there is a link between continued
fractions and short vectors in lattices. Let $\alpha \in \mathbb{R}$, and define the following sequences, starting
with $\alpha_0 = \alpha$, $p_0 = a_0$ and $q_0 = 1$, $p_1 = a_0 \cdot a_1 + 1$ and $q_1 = a_1$,

$$\begin{aligned}
a_i &= \lfloor \alpha_i \rfloor, \\
\alpha_{i+1} &= \frac{1}{\alpha_i - a_i}, \\
p_i &= a_i \cdot p_{i-1} + p_{i-2} \quad \text{for } i \ge 2, \\
q_i &= a_i \cdot q_{i-1} + q_{i-2} \quad \text{for } i \ge 2.
\end{aligned}$$

The integers $a_0, a_1, a_2, \ldots$ are called the continued fraction expansion of $\alpha$ and the fractions

$$\frac{p_i}{q_i}$$

are called the convergents. The denominators of these convergents grow at an exponential rate
and each convergent is a fraction in its lowest terms since one can show $\gcd(p_i, q_i) = 1$ for all
values of i. The important result is that if p and q are two integers with

$$\left| \alpha - \frac{p}{q} \right| < \frac{1}{2 \cdot q^2}$$

then $\frac{p}{q}$ is a convergent in the continued fraction expansion of $\alpha$.
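
The recurrences above translate directly into code. The following Python sketch computes the convergents of a rational number exactly; the function name, the choice of a rational input and the sample value 415/93 are assumptions made for illustration.

```python
from fractions import Fraction
from math import floor

def convergents(alpha, max_terms=50):
    """Convergents p_i/q_i of alpha, computed with exact rationals."""
    alpha = Fraction(alpha)
    p_prev2, q_prev2 = 0, 1      # p_{-2}, q_{-2}
    p_prev1, q_prev1 = 1, 0      # p_{-1}, q_{-1}
    result = []
    for _ in range(max_terms):
        a = floor(alpha)                     # a_i = floor(alpha_i)
        p = a * p_prev1 + p_prev2            # p_i = a_i p_{i-1} + p_{i-2}
        q = a * q_prev1 + q_prev2            # q_i = a_i q_{i-1} + q_{i-2}
        result.append(Fraction(p, q))
        if alpha == a:                       # expansion of a rational ends here
            break
        alpha = 1 / (alpha - a)              # alpha_{i+1}
        p_prev2, q_prev2, p_prev1, q_prev1 = p_prev1, q_prev1, p, q
    return result

alpha = Fraction(415, 93)
cs = convergents(alpha)
print(cs)

# Every convergent here satisfies the approximation bound of the text.
print([abs(alpha - c) < Fraction(1, 2 * c.denominator ** 2) for c in cs])
```

Starting the recurrence from the conventional values $p_{-1} = 1$, $q_{-1} = 0$, $p_{-2} = 0$, $q_{-2} = 1$ gives the same $p_0, q_0, p_1, q_1$ as the initial values stated above.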

A similar effect can be achieved using lattices by applying the LLL algorithm to the lattice
generated by the columns of the matrix

$$\begin{pmatrix} 1 & 0 \\ C \cdot \alpha & -C \end{pmatrix}$$

for some constant C. This is because the lattice L contains the “short” vector

$$\begin{pmatrix} q \\ C \cdot (q \cdot \alpha - p) \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ C \cdot \alpha & -C \end{pmatrix} \cdot \begin{pmatrix} q \\ p \end{pmatrix}.$$

Therefore, in some sense we can consider the LLL algorithm to be a multi-dimensional generalization
of the continued fraction algorithm.


5.2.  “Hard”  Lattice  Problems 

There  are  two  basic  hard  problems  associated  with  a  lattice.  The  first  is  the  shortest  vector  problem 
and  the  second  is  the  closest  vector  problem.  However,  these  problems  are  only  hard  for  large 
dimensions, since for large dimension the approximation factor of the LLL algorithm, $2^{(n-1)/2}$,
becomes too large and the LLL algorithm is no longer able to find a good basis.

Figure 5.3. An SVP solution


5.2.1.  Shortest  Vector  Problem:  The  simplest,  and  most  famous,  hard  problem  in  a  lattice  is 
to  determine  the  shortest  vector  within  the  lattice.  This  problem  comes  in  a  number  of  variants: 

Definition 5.4 (Shortest Vector Problem). Given a lattice basis B there are three variants of this
problem:

• The shortest vector problem SVP is to find a non-zero vector x in the lattice L generated
by B for which

$$\|x\| \le \|y\|$$

for all non-zero $y \in L$, i.e. $\|x\| = \lambda_1(L)$.

• The approximate-SVP problem $\mathrm{SVP}_\gamma$ is to find a non-zero $x \in L$ such that

$$\|x\| \le \gamma \cdot \lambda_1(L),$$

for some “small” constant $\gamma$.

• The $\gamma$-unique SVP problem $\mathrm{uSVP}_\gamma$ is given a lattice and a constant $\gamma > 1$ such that
$\lambda_2(L) > \gamma \cdot \lambda_1(L)$, find a non-zero $x \in L$ of length $\lambda_1(L)$.

See Figure 5.3 for an example two-dimensional lattice, the input basis, and the two shortest lattice
vectors which an SVP solver should find. Note that a short lattice vector is not unique, since if
$x \in L$ then we also have $-x \in L$. The LLL algorithm will heuristically solve the SVP, and for
large dimension will solve the approximate-SVP problem with a value of $\gamma$ of $2^{(n-1)/2}$ in the worst
case. The $\gamma$-unique SVP problem is potentially easier than the others, since we are given more
information about the underlying lattice.


5.2.2.  Closest  Vector  Problem:  We  now  present  the  second  most  important  problem  in  lattices, 
namely  the  closest  vector  problem.  See  Figure  5.4  for  an  example  two-dimensional  lattice,  the  input 
basis,  the  target  vector  x  in  blue,  and  the  closest  lattice  vector  y  in  red. 

Definition 5.5 (Closest Vector Problem). Given a lattice basis B generating a lattice L in n-dimensional
real space, and a vector $x \in \mathbb{R}^n$ such that $x \notin L$, there are two variants of this
problem:

• The closest vector problem CVP is to find a lattice vector $y \in L$ such that

$$\|x - y\| \le \|x - z\|$$

for all $z \in L$.

• The approximate-CVP problem $\mathrm{CVP}_\gamma$ is to find a $y \in L$ such that

$$\|x - y\| \le \gamma \cdot \|x - z\|$$

for all $z \in L$, for some “small” constant $\gamma$.

Figure 5.4. A CVP solution

The SVP and CVP problems are related in that, in practice, we can turn an algorithm which finds
approximate-SVP solutions into one which finds approximate-CVP solutions as follows. Let (B, x)
denote the input to the CVP problem, where without loss of generality we assume that B is an
$n \times n$ matrix, i.e. the lattice is full rank and in $\mathbb{R}^n$. We now form the lattice in $\mathbb{R}^{n+1}$ generated by
the columns of the matrix

$$B' = \begin{pmatrix} B & x \\ 0 & C \end{pmatrix}$$

where C is some “large” constant. We let b' denote a short vector in the lattice generated by B',
so we have that

$$b' = \begin{pmatrix} y + t \cdot x \\ t \cdot C \end{pmatrix}$$

where $t \in \mathbb{Z}$ and y is in the lattice L, i.e. $y = B \cdot a$ for $a \in \mathbb{Z}^n$. We then have that

$$\|b'\|^2 = \|y + t \cdot x\|^2 + t^2 \cdot C^2.$$

Now since $\|b'\|$ is “small” and C is “large” we expect that $t = \pm 1$. Without loss of generality we
can assume that $t = -1$. But this then implies that $\|y - x\|^2$ is very small, and so y is highly likely
to be an approximate solution to the original closest vector problem.


5.2.3. Bounded-Distance Decoding Problem: There is a “simpler” problem (meaning we give
the solver of the problem more information) associated with the closest vector problem called the
Bounded-Distance Decoding problem (or BDD problem). Here we not only give the adversary the
lattice and a non-lattice vector, but we also give the adversary a bound on the distance between
the lattice and the non-lattice vector. We are therefore giving the adversary more information than in
the above two CVP problems; in this sense the BDD problem is akin to the $\gamma$-unique SVP problem.

Definition 5.6 (Bounded-Distance Decoding). Given a lattice basis B generating a lattice L in
n-dimensional real space, a vector $x \in \mathbb{R}^n$ with $x \notin L$, and a real number $\alpha$. The Bounded-Distance
Decoding problem $\mathrm{BDD}_\alpha$ is to find a lattice vector $y \in L$ such that

$$\|x - y\| \le \alpha \cdot \lambda_1(L).$$

If the shortest non-zero vector in the lattice has size $\lambda_1(L)$, then if we have $\alpha < 1/2$ the solution to
the BDD problem is guaranteed to be unique. An interesting aspect of the BDD problem is that if
the vector x is known to be really close to the lattice, i.e. $\alpha$ is much smaller than one would expect
for a random value of x, then solving the BDD problem becomes easier.

The BDD problem will be very important later in Chapter 17 so we now discuss how hard BDD
is in relation to other more standard lattice problems. The traditional “practical” method of solving
$\mathrm{BDD}_\alpha$ is to embed the problem into a $\mathrm{uSVP}_\gamma$ problem of a lattice of dimension one larger than the
original problem. This is exactly our strategy above for solving CVP. For BDD we can formalize
this and show that the two problems are (essentially) equivalent. We present the reduction from
$\mathrm{BDD}_\alpha$ to $\mathrm{uSVP}_\gamma$, since we can use lattice reduction algorithms as a $\mathrm{uSVP}_\gamma$ oracle, and hence the
reduction in this direction is more practically relevant. We simplify things by assuming that the
distance in the $\mathrm{BDD}_\alpha$ problem is exactly known. Note that a more complex algorithm than that given
in the proof can deal with the case where this distance is not known, but is just known to be less
than $\alpha \cdot \lambda_1(L)$. In any case, in many examples the distance will be known with a high degree of
certainty.

Theorem 5.7. Given a lattice L via a basis B and a vector $x \in \mathbb{R}^n$, with $x \notin L$, such that the
minimum distance from x to a lattice point $\mathrm{dist}(x, L) = \mu \le \alpha \cdot \lambda_1(L)$, and an oracle for $\mathrm{uSVP}_\gamma$ for
$\gamma = 1/(2 \cdot \alpha)$ then, assuming $\mu$ is known, there is an algorithm which will output y such that $y \in L$
and $\|y - x\| = \mathrm{dist}(x, L)$.


Proof. First define the matrix

$$B' = \begin{pmatrix} B & x \\ 0 & \mu \end{pmatrix}$$

and consider the lattice L' generated by the matrix B'. For the target vector y define z by $y = B \cdot z$,
where $z \in \mathbb{Z}^n$. Consider the vector $y' \in L'$ defined by

$$y' = B' \cdot \begin{pmatrix} z \\ -1 \end{pmatrix} = \begin{pmatrix} y - x \\ -\mu \end{pmatrix}.$$

We have $\|y'\| = \sqrt{\mu^2 + \mu^2} = \sqrt{2} \cdot \mu$ by assumption that $\mathrm{dist}(x, L) = \mu$. If we pass the basis B' to
our $\mathrm{uSVP}_\gamma$ oracle then this will output a vector v'. We would like v' = y', i.e. $\|y'\| = \lambda_1(L')$ and
all other vectors in L' are either a multiple of y' or have length greater than $\gamma \cdot \|y'\| = \sqrt{2} \cdot \gamma \cdot \mu$. If
this is true we can solve our $\mathrm{BDD}_\alpha$ problem by taking the first n coordinates of y' and adding the
vector x to the result so as to obtain the solution y.

So assume, for sake of contradiction, that w' is a vector in L' of length less than $\sqrt{2} \cdot \mu \cdot \gamma$ and
that w' is not a multiple of y'. We can write, for $z_1 \in \mathbb{Z}^n$ and $\beta \in \mathbb{Z}$,

$$w' = B' \cdot (z_1, -\beta)^T = (B \cdot z_1 - \beta \cdot x, \; -\beta \cdot \mu)^T = (w - \beta \cdot x, \; -\beta \cdot \mu)^T$$

where $w = B \cdot z_1 \in L$. So we have

$$\sqrt{\|w - \beta \cdot x\|^2 + (\beta \cdot \mu)^2} = \left\| (w - \beta \cdot x, \; -\beta \cdot \mu)^T \right\| < \sqrt{2} \cdot \mu \cdot \gamma,$$

which implies that

$$\|w - \beta \cdot x\| < \sqrt{2 \cdot \mu^2 \cdot \gamma^2 - \beta^2 \cdot \mu^2}.$$

Now consider the vector $w - \beta \cdot y \in L$. Since w' is not a multiple of y', neither is w a multiple
of y, and so $w - \beta \cdot y \ne 0$. We wish to upper bound the length of $w - \beta \cdot y$:

$$\begin{aligned}
\|w - \beta \cdot y\| &= \|(w - \beta \cdot x) - \beta \cdot (y - x)\| \\
&\le \|w - \beta \cdot x\| + \beta \cdot \|y - x\| \\
&< \sqrt{2 \cdot \mu^2 \cdot \gamma^2 - \beta^2 \cdot \mu^2} + \beta \cdot \mu.
\end{aligned}$$

Now maximizing the right hand side by varying $\beta$, we find the maximum is $2 \cdot \gamma \cdot \mu$, which is achieved
when $\beta = \gamma$ for real $\beta$, and hence we will have $\|w - \beta \cdot y\| < 2 \cdot \gamma \cdot \mu$ for integer values of $\beta$ as well.

We now use the equality $\gamma = 1/(2 \cdot \alpha)$ and the inequality $\mu \le \alpha \cdot \lambda_1(L)$ to obtain that

$$\|w - \beta \cdot y\| < 2 \cdot \frac{1}{2 \cdot \alpha} \cdot (\alpha \cdot \lambda_1(L)) = \lambda_1(L).$$

Thus for all integer values of $\beta$ we conclude that $w - \beta \cdot y \in L$, that $w - \beta \cdot y \ne 0$ (since w is not
a multiple of y) and that $\|w - \beta \cdot y\| < \lambda_1(L)$ which is a contradiction, since $\lambda_1(L)$ is the length
of the smallest non-zero vector in the lattice.

□


So if we want to solve the $\mathrm{BDD}_\alpha$ problem we convert it into a $\mathrm{uSVP}_{1/(2 \cdot \alpha)}$ problem. We then apply
lattice basis reduction to the resulting $\mathrm{uSVP}_{1/(2 \cdot \alpha)}$ problem, and hope the resulting first basis element
allows us to solve the original $\mathrm{BDD}_\alpha$ problem. Since lattice reduction gets worse as the dimension
increases this means as n increases we can only solve BDD problems in which the distance between
the target vector and the lattice is very small. Alternatively, if we want BDD to be hard when the
distance between the target and the lattice is very small then we need the dimension to be large.

The other approach in the literature for solving $\mathrm{BDD}_\alpha$ in practice is to apply Babai’s algorithm
for solving closest vector problems. Babai’s algorithm takes as input a lattice basis B and a non-lattice
vector x and then outputs a lattice vector w such that the “error” $e = w - x$ lies in the
fundamental parallelepiped of the matrix $B^*$, where $B^*$ is the Gram-Schmidt basis associated with
B. In particular, since each coefficient of e with respect to $B^*$ has absolute value at most 1/2, we have

$$\|e\|^2 \le \frac{1}{4} \cdot \sum_{i=1}^{n} \|b_i^*\|^2.$$

If the input basis is reduced we expect this latter quantity to be small, and hence the error vector
e is itself small and represents the distance to the closest lattice vector to x.
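
One common form of Babai’s algorithm is the nearest-plane method, sketched below in Python with exact rationals. This is an illustration under stated assumptions: the text does not say which variant it has in mind, and the function names, the orthogonal basis (1, 1), (1, -1) and the target (2.3, 0.4) are chosen only for the example.

```python
from fractions import Fraction

def dot(u, v):
    """Exact inner product of two vectors."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def gram_schmidt(basis):
    """Gram-Schmidt vectors b_1^*, ..., b_n^* of the given basis vectors."""
    ortho = []
    for b in basis:
        v = [Fraction(c) for c in b]
        for u in ortho:
            mu = dot(b, u) / dot(u, u)
            v = [vi - mu * ui for vi, ui in zip(v, u)]
        ortho.append(v)
    return ortho

def babai_nearest_plane(basis, x):
    """Return a lattice vector w close to the target x."""
    ortho = gram_schmidt(basis)
    t = [Fraction(c) for c in x]            # remaining part of the target
    w = [Fraction(0)] * len(x)              # lattice vector built so far
    # Work from the last basis vector down to the first.
    for b, b_star in zip(reversed(basis), reversed(ortho)):
        c = round(dot(t, b_star) / dot(b_star, b_star))
        t = [ti - c * bi for ti, bi in zip(t, b)]
        w = [wi + c * bi for wi, bi in zip(w, b)]
    return w

basis = [[1, 1], [1, -1]]                    # assumed reduced basis
x = [Fraction(23, 10), Fraction(2, 5)]       # assumed target (2.3, 0.4)
w = babai_nearest_plane(basis, x)
e = [wi - xi for wi, xi in zip(w, x)]
print(w, dot(e, e))
```

For this input the output is the lattice point (2, 0), and the squared error 1/4 respects the bound $\frac{1}{4} \cdot (2 + 2) = 1$ obtained from the Gram-Schmidt norms.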


5.3. q-ary Lattices

Of importance in one of the systems described later are so-called q-ary lattices. A q-ary lattice L
is one such that $q\mathbb{Z}^n \subseteq L \subseteq \mathbb{Z}^n$ for some integer q. Note, that all integer lattices are q-ary lattices
for a value of q which is an integer multiple of $\Delta(L)$. Our interest will be in special forms of q-ary
lattice which are q-ary for a q-value much less than the determinant.

Suppose we are given a matrix $A \in \mathbb{Z}_q^{n \times m}$, with m > n; we then define the following two
m-dimensional q-ary lattices.

$$\Lambda_q(A) = \{ y \in \mathbb{Z}^m : y = A^T \cdot z \pmod{q} \text{ for some } z \in \mathbb{Z}^n \},$$

$$\Lambda_q^\perp(A) = \{ y \in \mathbb{Z}^m : A \cdot y = 0 \pmod{q} \}.$$

Suppose we have $y \in \Lambda_q(A)$ and $y' \in \Lambda_q^\perp(A)$, then we have $y = A^T \cdot z \pmod{q}$ and $A \cdot y' = 0
\pmod{q}$. This implies that

$$y^T \cdot y' = (z^T \cdot A) \cdot y' = z^T \cdot (A \cdot y') \in q \cdot \mathbb{Z}.$$

Hence, the two lattices are, up to normalization, duals of each other. We have $\Lambda_q(A) = q \cdot \Lambda_q^\perp(A)^*$
and $\Lambda_q^\perp(A) = q \cdot \Lambda_q(A)^*$.

Example: To fix ideas, consider the following example. Let n = 2, m = 3, q = 1009 and set

$$A = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 5 & 6 \end{pmatrix}.$$

To define a basis B of $\Lambda_q(A)$ we can take the column-Hermite Normal Form (HNF) of the $3 \times 5$
matrix $(A^T \mid q \cdot I_3)$ to obtain

$$B = \begin{pmatrix} 1009 & 1 & 336 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

The basis of $\Lambda_q^\perp(A)$ is given by

$$q \cdot (B^{-1})^T = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1009 & 0 \\ -336 & 0 & 1009 \end{pmatrix}.$$
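
The example can be checked mechanically. The Python sketch below verifies membership of the basis vectors in the two lattices and the orthogonality modulo q; the variable names and the explicit witnesses z are assumptions introduced for the check, not part of the text.

```python
q = 1009
A = [[1, 2, 3],
     [3, 5, 6]]

# Basis vectors (the matrix columns) of the two lattices in the example.
B_cols = [[1009, 0, 0], [1, 1, 0], [336, 0, 1]]
B_perp_cols = [[1, -1, -336], [0, 1009, 0], [0, 0, 1009]]

def in_lambda_perp(y):
    """True if A * y = 0 (mod q)."""
    return all(sum(a * c for a, c in zip(row, y)) % q == 0 for row in A)

def a_transpose_times(z):
    """The vector A^T * z reduced modulo q."""
    return [sum(A[i][j] * z[i] for i in range(len(A))) % q
            for j in range(len(A[0]))]

# Witnesses z with A^T * z = column (mod q), one per column of B.
witnesses = [[0, 0], [-2, 1], [338, 672]]

perp_ok = all(in_lambda_perp(y) for y in B_perp_cols)
image_ok = all(a_transpose_times(z) == [c % q for c in col]
               for z, col in zip(witnesses, B_cols))
dual_ok = all(sum(a * b for a, b in zip(y, yp)) % q == 0
              for y in B_cols for yp in B_perp_cols)

print(perp_ok, image_ok, dual_ok)
```

Both matrices are triangular, so their determinants are the products of the diagonal entries, namely 1009 and $1009^2$.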

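The properties of the above example hold in general; namely if q is prime and m is a bit larger than
n then we have $\Delta(\Lambda_q(A)) = q^{m-n}$ and $\Delta(\Lambda_q^\perp(A)) = q^n$. From this, using the Gaussian Heuristic,
we find for $A \in \mathbb{Z}_q^{n \times m}$ that we expect

$$\lambda_1(\Lambda_q(A)) \approx \sqrt{\frac{m}{2 \cdot \pi \cdot e}} \cdot q^{(m-n)/m},$$

$$\lambda_1(\Lambda_q^\perp(A)) \approx \sqrt{\frac{m}{2 \cdot \pi \cdot e}} \cdot q^{n/m}.$$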


Another lattice-based problem which is of interest in cryptography, and is related to these q-ary
lattices, is the Short Integer Solution problem (or SIS problem).

Definition 5.8 (Short Integer Solution). Given an integer q and vectors $a_1, \ldots, a_m \in (\mathbb{Z}/q\mathbb{Z})^n$ the
SIS problem is to find a short non-zero $z \in \mathbb{Z}^m$ such that

$$z_1 \cdot a_1 + \cdots + z_m \cdot a_m = 0 \pmod{q}.$$

Here “short” often means $z_i \in \{-1, 0, 1\}$.

The SIS problem is related to q-ary lattices in the following way. If we set $A = (a_1, \ldots, a_m) \in
(\mathbb{Z}/q\mathbb{Z})^{n \times m}$ and set

$$\Lambda_q^\perp(A) = \{ z \in \mathbb{Z}^m : A \cdot z = 0 \pmod{q} \},$$

then the SIS problem becomes the shortest vector problem for the lattice $\Lambda_q^\perp(A)$.

5.4.  Coppersmith’s  Theorem 

In this section we examine a standard tool which is used when one applies lattices to attack certain
systems. Much of the work in this area is derived from initial work of Coppersmith, which was later
simplified by Howgrave-Graham. Coppersmith’s contribution was to provide a method to solve the
following problem: Given a polynomial of degree d

$$f(x) = f_0 + f_1 \cdot x + \cdots + f_{d-1} \cdot x^{d-1} + x^d$$

over the integers and the side information that there exists a root $x_0$ modulo N which is small, say
$|x_0| < N^{1/d}$, can one efficiently find the small root $x_0$? The answer is surprisingly yes, and this
leads to a number of interesting cryptographic consequences.

The basic idea is to find a polynomial $h(x) \in \mathbb{Z}[x]$ which has the same root modulo N as the
target polynomial f(x). This new polynomial h(x) should be small in the sense that the norm of
its coefficients,

$$\|h\|^2 = \sum_{i=0}^{\deg(h)} h_i^2,$$

should be small. If such an h(x) can be found then we can appeal to the following lemma.

Lemma 5.9. Let $h(x) \in \mathbb{Z}[x]$ denote a polynomial with at most n coefficients (i.e. of degree less
than n) and let X and N be positive integers. Suppose

$$\|h(X \cdot x)\| < N / \sqrt{n}$$

then if $|x_0| < X$ satisfies

$$h(x_0) = 0 \pmod{N}$$

then $h(x_0) = 0$ over the integers and not just modulo N.

Thus to solve our original problem we need to find the roots of h(x) over the integers, which can
be done in polynomial time via a variety of methods. Now we return to our original polynomial
f(x) of degree d and notice that if

$$f(x_0) = 0 \pmod{N}$$

then we also have

$$f(x_0)^k = 0 \pmod{N^k}.$$

Moreover, if we set, for some given value of m,

$$g_{u,v}(x) \leftarrow N^{m-v} \cdot x^u \cdot f(x)^v$$

then

$$g_{u,v}(x_0) = 0 \pmod{N^m}$$

for all $0 \le u < d$ and $0 \le v \le m$. We then fix m and try to find $a_{u,v} \in \mathbb{Z}$ so that

$$h(x) = \sum_{0 \le u < d} \left( \sum_{v=0}^{m} a_{u,v} \cdot g_{u,v}(x) \right)$$

satisfies the conditions of the above lemma. In other words we wish to find integer values of $a_{u,v}$
so that the resulting polynomial h satisfies

$$\|h(X \cdot x)\| < N^m / \sqrt{d \cdot (m+1)},$$

with

$$h(X \cdot x) = \sum_{0 \le u < d} \left( \sum_{v=0}^{m} a_{u,v} \cdot g_{u,v}(X \cdot x) \right).$$

This is a minimization problem which can be solved using lattice basis reduction, as we shall now
show in a simple example.

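Example: Suppose our polynomial f(x) is given by

$$f(x) = x^2 + a \cdot x + b$$

and we wish to find an $x_0$ such that

$$f(x_0) = 0 \pmod{N}.$$

We set m = 2 in the above construction and compute

$$\begin{aligned}
g_{0,0}(X \cdot x) &\leftarrow N^2, \\
g_{1,0}(X \cdot x) &\leftarrow X \cdot N^2 \cdot x, \\
g_{0,1}(X \cdot x) &\leftarrow b \cdot N + a \cdot X \cdot N \cdot x + N \cdot X^2 \cdot x^2, \\
g_{1,1}(X \cdot x) &\leftarrow b \cdot N \cdot X \cdot x + a \cdot N \cdot X^2 \cdot x^2 + N \cdot X^3 \cdot x^3, \\
g_{0,2}(X \cdot x) &\leftarrow b^2 + 2 \cdot b \cdot a \cdot X \cdot x + (a^2 + 2 \cdot b) \cdot X^2 \cdot x^2 + 2 \cdot a \cdot X^3 \cdot x^3 + X^4 \cdot x^4, \\
g_{1,2}(X \cdot x) &\leftarrow b^2 \cdot X \cdot x + 2 \cdot b \cdot a \cdot X^2 \cdot x^2 + (a^2 + 2 \cdot b) \cdot X^3 \cdot x^3 + 2 \cdot a \cdot X^4 \cdot x^4 + X^5 \cdot x^5.
\end{aligned}$$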

We are looking for a linear combination of the above six polynomials such that the resulting
polynomial has small coefficients. Hence we are led to look for small vectors in the lattice generated
by the columns of the following matrix, where each column represents one of the polynomials above
and each row represents a power of x,

$$A = \begin{pmatrix}
N^2 & 0 & b \cdot N & 0 & b^2 & 0 \\
0 & X \cdot N^2 & a \cdot X \cdot N & b \cdot N \cdot X & 2 \cdot a \cdot b \cdot X & X \cdot b^2 \\
0 & 0 & N \cdot X^2 & a \cdot N \cdot X^2 & (a^2 + 2 \cdot b) \cdot X^2 & 2 \cdot a \cdot b \cdot X^2 \\
0 & 0 & 0 & N \cdot X^3 & 2 \cdot a \cdot X^3 & (a^2 + 2 \cdot b) \cdot X^3 \\
0 & 0 & 0 & 0 & X^4 & 2 \cdot a \cdot X^4 \\
0 & 0 & 0 & 0 & 0 & X^5
\end{pmatrix}.$$

This matrix has determinant equal to

$$\det(A) = N^6 \cdot X^{15},$$

and so applying the LLL algorithm to this matrix we obtain a new lattice basis B. The first vector
$b_1$ in B will satisfy

$$\|b_1\| \le 2^{6/4} \cdot \det(A)^{1/6} = 2^{3/2} \cdot N \cdot X^{5/2}.$$

So if we set $b_1 = A \cdot u$, with $u = (u_1, u_2, \ldots, u_6)^T$, and form the polynomial

$$h(x) = u_1 \cdot g_{0,0}(x) + u_2 \cdot g_{1,0}(x) + \cdots + u_6 \cdot g_{1,2}(x)$$

then we will have

$$\|h(X \cdot x)\| \le 2^{3/2} \cdot N \cdot X^{5/2}.$$

To apply Lemma 5.9 we will require that

$$2^{3/2} \cdot N \cdot X^{5/2} < N^2 / \sqrt{6}.$$

Hence by determining an integer root of h(x) we will determine the small root $x_0$ of f(x) modulo
N, assuming that

$$|x_0| \le X < \frac{N^{2/5}}{48^{1/5}}.$$

In particular, for N large enough, this will work when $|x_0| \le N^{0.39}$.
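
The bound on X follows from the requirement of Lemma 5.9 by a short rearrangement, written out below. The last line, which estimates how large N must be for the exponent 0.39 to be admissible, is an added back-of-the-envelope remark and uses nothing beyond the inequality itself.

```latex
\begin{align*}
2^{3/2} \cdot N \cdot X^{5/2} < \frac{N^2}{\sqrt{6}}
  &\iff X^{5/2} < \frac{N}{2^{3/2} \cdot \sqrt{6}} = \frac{N}{\sqrt{48}} \\
  &\iff X < \frac{N^{2/5}}{48^{1/5}}, \qquad 48^{1/5} \approx 2.17, \\
N^{0.39} < \frac{N^{0.4}}{48^{1/5}}
  &\iff N^{0.01} > 48^{1/5}
  \iff \log_2 N > 20 \cdot \log_2 48 \approx 111.7 .
\end{align*}
```

So the exponent 0.39 is safe once N has more than roughly 112 bits, which is far below the size of any modulus used in practice.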


A  similar  technique  can  be  applied  to  any  polynomial  of  degree  d  so  as  to  obtain  the  following. 

Theorem 5.10 (Coppersmith). Let $f \in \mathbb{Z}[x]$ be a monic polynomial of degree d and N an integer.
If there is some root $x_0$ of f modulo N such that $|x_0| \le X = N^{1/d - \epsilon}$ then one can find $x_0$ in time
polynomial in $\log N$ and $1/\epsilon$, for fixed values of d.

Similar considerations apply to polynomials in two variables; the analogue of Lemma 5.9 is as follows:

Lemma 5.11. Let $h(x, y) \in \mathbb{Z}[x, y]$ denote a sum of at most w monomials and suppose

$$h(x_0, y_0) = 0 \pmod{N^e}$$

for some positive integers N and e where the integers $x_0$ and $y_0$ satisfy

$$|x_0| < X \quad \text{and} \quad |y_0| < Y$$

and

$$\|h(X \cdot x, Y \cdot y)\| < N^e / \sqrt{w}.$$

Then $h(x_0, y_0) = 0$ holds over the integers.

However, the analogue of Theorem 5.10 then becomes only a heuristic result.


Chapter  Summary 


•  Lattices  are  discrete  analogues  of  vector  spaces;  as  such  they  have  a  shortest  non-zero 
vector. 

•  Lattice  basis  reduction  often  allows  us  to  find  the  shortest  non-zero  vector  in  a  given 
lattice,  thus  lattice  reduction  allows  us  to  solve  the  shortest  vector  problem. 

•  Other  lattice  problems,  such  as  the  CVP  and  BDD  problems,  can  also  be  solved  if  we  can 
find  good  bases  of  lattices. 

•  In  small  dimensions  the  LLL  algorithm  works  very  well.  In  larger  dimensions,  whilst  it  is 
fast,  it  does  not  produce  such  a  good  output  lattice. 

• The SIS problem is related to the SVP problem in q-ary lattices.

• Coppersmith’s Theorem allows us to solve a modular polynomial equation when the solution
is known to be small. The method works by building a lattice depending on the
polynomial and then applying lattice basis reduction to obtain short vectors within this
lattice.


Further  Reading 

A complete survey of lattice-based methods in cryptography is given in the article by Nguyen
and Stern. The main paper on Coppersmith’s approach is by Coppersmith himself; however, the
approach was simplified somewhat in the paper of Howgrave-Graham.

P. Nguyen and J. Stern. The two faces of lattices in cryptology. In CALC ’01, LNCS 2146, 146-180,
Springer, 2001.

N. Howgrave-Graham. Finding small roots of univariate modular equations revisited. In Cryptography
and Coding, LNCS 1355, 131-142, Springer, 1997.


CHAPTER  6 


Implementation  Issues 


Chapter  Goals 

•  To  show  how  exponentiation  algorithms  are  implemented. 

•  To  explain  how  modular  arithmetic  can  be  implemented  efficiently  on  large  numbers. 

•  To  show  how  certain  tricks  can  be  used  to  speed  up  exponentiation  operations. 

•  To  show  how  finite  fields  of  characteristic  two  can  be  implemented  efficiently. 

6.1.  Introduction 

In  this  chapter  we  examine  how  one  actually  implements  cryptographic  operations.  We  shall  mainly 
be  concerned  with  public  key  operations  since  those  are  the  most  complex  to  implement.  For 
example,  when  we  introduce  RSA  or  DSA  later  we  will  have  to  perform  a  modular  exponentiation 
with  respect  to  a  modulus  of  a  thousand  or  more  bits.  This  means  we  need  to  understand  the 
implementation  issues  involved  with  both  modular  arithmetic  and  exponentiation  algorithms. 

There  is  another  reason  to  focus  on  public  key  algorithms  rather  than  private  key  ones:  in 
general  public  key  schemes  run  much  more  slowly  than  symmetric  schemes.  In  fact  they  can  be  so 
slow  that  their  use  can  make  networks  and  web  servers  unusable.  Hence,  efficient  implementation  is 
crucial  unless  one  is  willing  to  pay  a  large  performance  penalty.  The  chapter  focuses  on  algorithms 
used  in  software;  for  hardware-based  algorithms  one  often  uses  different  techniques  entirely. 

6.2.  Exponentiation  Algorithms 

So far in this book, e.g. when discussing primality testing in Chapter 2, we have assumed that
computing

$$y = x^d \pmod{n}$$

is an easy operation. We will also need this operation both in RSA and in systems based on
discrete logarithms such as ElGamal encryption and DSA. In this section we concentrate on the
exponentiation algorithms and assume that we can perform modular arithmetic efficiently. In a
later section we shall discuss how to perform modular arithmetic. Firstly note it does not make
sense to perform this operation via the sequence

• Compute $r \leftarrow x^d$,

• Compute $y \leftarrow r \pmod{n}$.

To see this, consider

$$123^5 \pmod{511} = 28\,153\,056\,843 \pmod{511} = 359.$$

With this naive method one obtains a huge intermediate result; in our small case above this is

$$28\,153\,056\,843.$$

But in a real 2048-bit exponentiation this intermediate result would be in general $2^{2048} \cdot 2048$ bits
long. Such a number requires over $10^{600}$ gigabytes simply to write down.
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6.2.1.  Binary  Exponentiation:  To  stop  this  explosion  in  the  size  of  any  intermediate  results 
we  use  the  fact  that  we  are  working  modulo  n.  But  even  here  one  needs  to  be  careful;  a  naive 
algorithm  would  compute  the  above  example  by  computing 


x  =  123, 


x 

x 

x 

x 


2 

3 

4 

5 


xxx  (mod  511)  =  310, 
xxx2  (mod  511)  =  316, 

xxx3  (mod  511)  =  32, 

xxx4  (mod  511)  =  359. 


This requires four modular multiplications, which seems fine for our small example. But a general
exponentiation by a 2048-bit exponent using this method would require around $2^{2048}$ modular
multiplications. If each such multiplication could be done in under one millionth of a second we
would still require around $10^{600}$ years to perform this operation.

However,  it  is  easy  to  see  that,  even  in  our  small  example,  we  can  reduce  the  number  of  required 
multiplications  by  being  a  little  more  clever: 


$x = 123$,
$x^2 = x \times x \pmod{511} = 310$,
$x^4 = x^2 \times x^2 \pmod{511} = 32$,
$x^5 = x \times x^4 \pmod{511} = 359$.

This requires only three modular multiplications rather than the previous four. To understand
why we only require three modular multiplications notice that the exponent 5 has binary
representation 0b101 and so

•  Has  bit  length  t  =  3, 

•  Has  Hamming  weight  h  =  2. 

In the above example we required 1 = (h - 1) general multiplications and 2 = (t - 1) squarings.
This fact holds in general, in that a modular exponentiation can be performed using

• (h - 1) multiplications,

• (t - 1) squarings,

where t is the bit length of the exponent and h is the Hamming weight. The average Hamming
weight of a t-bit integer is t/2, so the number of multiplications and squarings is on average

$(t - 1) + (t/2 - 1) = t + t/2 - 2$.

For a 2048-bit modulus this means that the number of modular multiplications needed to
perform exponentiation by a 2048-bit exponent is at most 4096 and on average about 3072.


Algorithm 6.1: Binary exponentiation: Right-to-left variant
y ← 1.
while d ≠ 0 do
    if (d mod 2) ≠ 0 then
        y ← (y · x) mod n.
        d ← d - 1.
    d ← d/2.
    x ← (x · x) mod n.

The method used to achieve this improvement in performance is called the binary exponentiation
method. This is because it works by reading each bit of the binary representation of the exponent
in turn, starting with the least significant bit and working up to the most significant bit. Algorithm
6.1 explains the method by computing

$y = x^d \pmod{n}$.

The  above  binary  exponentiation  algorithm  has  a  number  of  different  names:  some  authors  call  it 
the  square  and  multiply  algorithm,  since  it  proceeds  by  a  sequence  of  squarings  and  multiplications, 
other  authors  call  it  the  Indian  exponentiation  algorithm.  Algorithm  6.1  is  called  a  right-to-left 
exponentiation  algorithm  since  it  processes  the  bits  of  d  from  the  least  significant  bit  (the  right 
one)  up  to  the  most  significant  bit  (the  left  one). 
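
The right-to-left method of Algorithm 6.1 can be sketched in Python as follows. This is only a minimal illustration: the function name `binary_exp_rtl` is our own, Python's built-in big integers stand in for the multi-precision arithmetic discussed later in the chapter, and no attempt is made to run in constant time.

```python
def binary_exp_rtl(x, d, n):
    """Right-to-left binary exponentiation: returns x**d mod n (d >= 0, n >= 1)."""
    y = 1 % n                    # handles the degenerate modulus n = 1
    x %= n
    while d != 0:
        if d & 1:                # least significant bit of d is set
            y = (y * x) % n      # general multiplication
        d >>= 1                  # same effect as d <- d - 1 (if odd), then d <- d / 2
        x = (x * x) % n          # squaring: x now holds the next power x^(2^i)
    return y


print(binary_exp_rtl(123, 5, 511))   # the running example from the text: 359
```

The loop performs one squaring per bit of d and one general multiplication per set bit, which is where the counts in terms of t and h come from.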

6.2.2.  Window  Exponentiation  Methods:  Most  of  the  time  it  is  faster  to  perform  a  squaring 
operation  than  a  general  multiplication.  Hence  to  reduce  time  even  more  one  tries  to  reduce  the 
total  number  of  modular  multiplications  even  further.  This  is  done  using  window  techniques  which 
trade  off  precomputations  (i.e.  storage)  against  the  time  in  the  main  loop. 

To  understand  window  methods  better  we  first  examine  the  binary  exponentiation  method 
again.  But  this  time  instead  of  a  right-to-left  variant,  we  process  the  exponent  from  the  most 
significant bit first, thus producing a left-to-right binary exponentiation algorithm; see Algorithm
6.2. Again we assume we wish to compute

$y = x^d \pmod{n}$.

We first give a notation for the binary representation of the exponent

$d = \sum_{i=0}^{t} d_i \cdot 2^i$,

where $d_i \in \{0, 1\}$. The algorithm processes a single bit of the exponent on every iteration of the
loop. Again the number of squarings is equal to t and the expected number of multiplications is
equal to t/2.


Algorithm 6.2: Binary exponentiation: Left-to-right variant
y ← 1.
for i = t downto 0 do
    y ← (y · y) mod n.
    if d_i = 1 then y ← (y · x) mod n.


In a window method we process w bits of the exponent at a time, as in Algorithm 6.3. We first
precompute a table

$x_i = x^i \pmod{n}$ for $i = 0, \ldots, 2^w - 1$.

Then we write our exponent out, but this time taking w bits at a time,

$d = \sum_{i=0}^{t/w} d_i \cdot 2^{i \cdot w}$,

where $d_i \in \{0, 1, 2, \ldots, 2^w - 1\}$.

It is perhaps easier to illustrate this with an example. Suppose we wish to compute

$y = x^{215} \pmod{n}$

with a window width of w = 3. We compute the $d_i$ as

$215 = 3 \cdot 2^6 + 2 \cdot 2^3 + 7$.


Algorithm 6.3: Window exponentiation method
y ← 1.
for i = t/w downto 0 do
    for j = 0 to w - 1 do y ← (y · y) mod n.
    j ← d_i.
    y ← (y · x_j) mod n.


Hence, our iteration to compute $x^{215} \pmod{n}$ computes in order

$y = 1$,
$y = y \cdot x^3 = x^3$,
$y = y^8 = x^{24}$,
$y = y \cdot x^2 = x^{26}$,
$y = y^8 = x^{208}$,
$y = y \cdot x^7 = x^{215}$.
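
The fixed-window method can be sketched in Python along the following lines. The helper names `window_digits` and `window_exp` are our own, and the sketch favours readability over speed: it recomputes the table on every call and does not try to hide the exponent from timing measurements.

```python
def window_digits(d, w):
    """Split d into w-bit digits d_i, returned most significant first."""
    digits = []
    while d > 0:
        digits.append(d & ((1 << w) - 1))
        d >>= w
    return digits[::-1]


def window_exp(x, d, n, w=3):
    """Fixed-window exponentiation: returns x**d mod n."""
    # Precompute the table x_i = x^i mod n for i = 0, ..., 2^w - 1.
    table = [1 % n]
    for _ in range(1, 1 << w):
        table.append((table[-1] * x) % n)
    y = 1 % n
    for digit in window_digits(d, w):
        for _ in range(w):               # w squarings per window
            y = (y * y) % n
        y = (y * table[digit]) % n       # one multiplication per window
    return y


print(window_digits(215, 3))             # [3, 2, 7], matching 215 = 3*2^6 + 2*2^3 + 7
```

Larger values of w mean fewer multiplications in the main loop but a table of $2^w$ entries, which is the storage-against-time trade-off mentioned above.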


6.2.3. Sliding Window Method: With a window method as above, we still perform t squarings
but the number of multiplications reduces to t/w on average. One can do even better by adopting
a sliding window method, where we now encode our exponent as

$d = \sum_{i=0}^{l} d_i \cdot 2^{e_i}$,

where $d_i \in \{1, 3, 5, \ldots, 2^w - 1\}$ and $e_{i+1} - e_i \ge w$. By choosing only odd values for $d_i$ and having a
variable window width we achieve both decreased storage for the precomputed values and improved
efficiency. After precomputing $x_i = x^i$ for $i = 1, 3, 5, \ldots, 2^w - 1$, we execute Algorithm 6.4.


Algorithm 6.4: Sliding window exponentiation
y ← 1.
for i = l downto 0 do
    for j = 0 to e_{i+1} - e_i - 1 do y ← (y · y) mod n.
    j ← d_i.
    y ← (y · x_j) mod n.
for j = 0 to e_0 - 1 do y ← (y · y) mod n.


The number of squarings remains again at t, but now the number of multiplications reduces to
l, which is about t/(w + 1) on average. In our example of computing $y = x^{215} \pmod{n}$ we have

$215 = 2^7 + 5 \cdot 2^4 + 7$,

and so we execute the steps

$y = 1$,
$y = y \cdot x = x^1$,
$y = y^8 = x^8$,
$y = y \cdot x^5 = x^{13}$,
$y = y^{16} = x^{208}$,
$y = y \cdot x^7 = x^{215}$.
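
A possible Python sketch of the sliding window method is given below. The recoding routine `sliding_recode` is an assumption of ours (the text does not say how the $d_i$ and $e_i$ are found); it scans the exponent from the least significant bit, skips zeros, and takes w bits whenever it meets a one, which happens to reproduce the decomposition of 215 used in the example.

```python
def sliding_recode(d, w):
    """Write d as a sum of d_i * 2^(e_i) with odd d_i < 2^w.

    Returns the pairs (d_i, e_i) with the least significant term first.
    """
    terms = []
    e = 0
    while d > 0:
        if (d & 1) == 0:                 # skip over a zero bit
            d >>= 1
            e += 1
        else:                            # take a window of w bits starting at a one bit
            terms.append((d & ((1 << w) - 1), e))
            d >>= w
            e += w
    return terms


def sliding_window_exp(x, d, n, w=3):
    """Sliding-window exponentiation: returns x**d mod n."""
    x %= n
    x_sq = (x * x) % n
    table = {1: x}                       # odd powers x^1, x^3, ..., x^(2^w - 1)
    for i in range(3, 1 << w, 2):
        table[i] = (table[i - 2] * x_sq) % n
    y = 1 % n
    prev_e = 0
    for k, (digit, e) in enumerate(reversed(sliding_recode(d, w))):
        if k > 0:
            for _ in range(prev_e - e):  # e_{i+1} - e_i squarings
                y = (y * y) % n
        y = (y * table[digit]) % n
        prev_e = e
    for _ in range(prev_e):              # final e_0 squarings
        y = (y * y) % n
    return y


print(sliding_recode(215, 3))            # [(7, 0), (5, 4), (1, 7)]
```

Only the odd powers are stored, so the table has $2^{w-1}$ entries rather than the $2^w$ of the fixed-window method.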


6.2.4. Generalizations to Any Group: Notice that all of the above window algorithms apply
to exponentiation in any abelian group and not just the integers modulo n. Hence, we can use
these algorithms to compute $a^d$ in a finite field or to compute $[d]P$ on an elliptic curve; in the latter
case we call this point multiplication rather than exponentiation.

An advantage with elliptic curve variants is that negation comes for free, in that given P it is
easy to compute -P. This leads to the use of signed binary and signed window methods. We only
present the signed window method. We precompute

$P_i = [i]P$ for $i = 1, 3, 5, \ldots, 2^{w-1} - 1$,

which requires only half the storage of the equivalent sliding window method or one quarter of the
storage of the equivalent standard window method. We now write our multiplicand d as

$d = \sum_{i=0}^{l} d_i \cdot 2^{e_i}$,

where $d_i \in \{\pm 1, \pm 3, \pm 5, \ldots, \pm(2^{w-1} - 1)\}$. The signed sliding window method for elliptic curves is
then given by Algorithm 6.5.


Algorithm 6.5: Signed sliding window method
Q ← 0.
for i = l downto 0 do
    for j = 0 to e_{i+1} - e_i - 1 do Q ← [2]Q.
    j ← d_i.
    if j > 0 then Q ← Q + P_j.
    else Q ← Q - P_{-j}.
for j = 0 to e_0 - 1 do Q ← [2]Q.
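
The signed method can be illustrated with the following Python sketch. Two things here are assumptions of ours rather than part of the text: the recoding routine `signed_recode`, which picks each odd digit as the residue of d modulo $2^w$ nearest to zero, and the way the group is passed in as three callables (`add`, `neg`, `zero`) so that the sketch can be exercised without an elliptic curve implementation.

```python
def signed_recode(d, w):
    """Write d as a sum of d_i * 2^(e_i) with odd |d_i| < 2^(w-1), for w >= 2.

    Returns the pairs (d_i, e_i) with the least significant term first.
    """
    terms = []
    e = 0
    while d > 0:
        if d & 1:
            digit = d % (1 << w)
            if digit >= 1 << (w - 1):    # pick the representative nearest zero
                digit -= 1 << w
            terms.append((digit, e))
            d -= digit                   # d is now divisible by 2^w
        d >>= 1
        e += 1
    return terms


def signed_window_mul(d, P, add, neg, zero, w=3):
    """Compute [d]P in an additive group given by add, neg and zero."""
    double_P = add(P, P)
    table = {1: P}                       # P_i = [i]P for odd i < 2^(w-1)
    for i in range(3, 1 << (w - 1), 2):
        table[i] = add(table[i - 2], double_P)
    Q = zero
    prev_e = 0
    for k, (digit, e) in enumerate(reversed(signed_recode(d, w))):
        if k > 0:
            for _ in range(prev_e - e):
                Q = add(Q, Q)            # doubling, written [2]Q in the text
        Q = add(Q, table[digit]) if digit > 0 else add(Q, neg(table[-digit]))
        prev_e = e
    for _ in range(prev_e):
        Q = add(Q, Q)
    return Q


print(signed_recode(215, 3))             # [(-1, 0), (3, 3), (3, 6)]
```

As a stand-in for a curve one can use the integers modulo m under addition, for example `add = lambda a, b: (a + b) % m` and `neg = lambda a: -a % m`, in which case `signed_window_mul(d, P, add, neg, 0)` should equal `d * P % m`.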


6.3.  Special  Exponentiation  Methods 

To  speed  up  public  key  algorithms  even  more  in  practice,  various  tricks  are  used,  the  precise  one 
depending  on  whether  we  are  performing  an  operation  with  a  public  exponent  or  a  private  exponent. 

6.3.1. Small Exponents: When we compute $y = x^e \pmod{n}$ where the evaluator does not know
the factors of n, but knows the exponent e, we often select e to be very small (this does not seem to
create any major attacks), for example e = 3, 17 or 65 537. The reason for these particular values
is that they have small Hamming weight, in fact the smallest possible for a non-trivial exponent,
namely two. This means that the binary method, or any other exponentiation algorithm, will
require only one general multiplication, but it will still need k squarings where k is the bit size of
the exponent e. For example

$x^3 = x^2 \times x$,
$x^{17} = x^{16} \times x = (((x^2)^2)^2)^2 \times x$.

6.3.2.  Knowing  p  and  q:  In  the  case  of  RSA  decryption,  or  signing,  the  exponent  will  be  a 
general  and  secret  2000-bit  number.  Hence,  we  need  some  way  of  speeding  up  the  computation. 
Luckily,  since  we  are  considering  a  private  key  operation  we  have  access  to  the  prime  factors  of  n, 


$n = p \cdot q$.


Suppose we wish to compute

$y = x^d \pmod{n}$.

We speed up the calculation by first computing y modulo p and q:

$y_p = x^d \pmod{p} = x^{d \bmod (p-1)} \pmod{p}$,
$y_q = x^d \pmod{q} = x^{d \bmod (q-1)} \pmod{q}$.


Since  p  and  q  are  1024-bit  numbers,  the  above  calculation  requires  two  exponentiations  modulo 
1024-bit  moduli  and  1024-bit  exponents.  This  is  faster  than  a  single  exponentiation  modulo  a 
2048-bit  number  with  a  2048-bit  exponent. 

But we now need to recover y from $y_p$ and $y_q$, which is done using the Chinese Remainder
Theorem as follows: We compute $t = p^{-1} \pmod{q}$ and store it with the values p and q. The value
y can then be recovered from $y_p$ and $y_q$ via

• $u = (y_q - y_p) \cdot t \pmod{q}$,

• $y = y_p + u \cdot p$.

This  is  why  later  on  in  Chapter  15  we  say  that  when  you  generate  a  private  key  it  is  best  to  store 
p  and  q  even  though  they  are  not  mathematically  needed. 
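
The recombination above can be sketched in Python as follows. The function name `rsa_crt_exp` and the toy parameters are our own and are far too small to be secure; the sketch also assumes that x is coprime to n, so that reducing the exponent modulo p - 1 and q - 1 is valid.

```python
def rsa_crt_exp(x, d, p, q, t):
    """Return x**d mod (p*q) using the prime factors p and q.

    t must equal p^{-1} mod q. Assumes gcd(x, p*q) = 1, so that the
    exponent can be reduced modulo p - 1 and q - 1.
    """
    y_p = pow(x % p, d % (p - 1), p)     # half-size exponentiation modulo p
    y_q = pow(x % q, d % (q - 1), q)     # half-size exponentiation modulo q
    u = ((y_q - y_p) * t) % q
    return y_p + u * p


# Toy parameters for illustration only.
p, q, d = 61, 53, 2753
t = pow(p, -1, q)                        # modular inverse, needs Python 3.8+
print(rsa_crt_exp(65, d, p, q, t) == pow(65, d, p * q))   # True
```

In practice t is computed once at key generation and stored alongside p and q, so each private key operation costs two half-size exponentiations plus a few cheap multiplications.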


6.3.3. Multi-exponentiation: Sometimes we need to compute

$r = g^a \cdot y^b \pmod{n}$.

This can be accomplished by first computing $g^a$ and then $y^b$ and then multiplying the results
together. However, often it is easier to perform the two exponentiations simultaneously. There are
a number of techniques to accomplish this, using various forms of window techniques etc. But all
are essentially based on the following idea, called Shamir’s trick.

We first compute the look-up table

$G_i = g^{i_0} \cdot y^{i_1}$,

where $(i_1, i_0)$ is the binary representation of i, for i = 0, 1, 2, 3. We then compute an exponent
array from the two exponents a and b. This is a two-by-t array, where t is the maximum bit length
of a and b. The rows of this array are the binary representation of the exponents a and b. We then
let $I_j$, for $j = 1, \ldots, t$, denote the integers whose binary representation is given by the columns of
this array. The exponentiation is then computed by setting r = 1 and computing

$r = r^2 \cdot G_{I_j}$

for j = 1 to t. As an example suppose we wish to compute

$r = g^{11} \cdot y^7$,

hence we have t = 4. We precompute

$G_0 = 1$, $G_1 = g$, $G_2 = y$, $G_3 = g \cdot y$.

Since the binary representation of 11 and 7 is given by 1011 and 111, our exponent array is given
by

1 0 1 1
0 1 1 1

The integers $I_j$ then become

$I_1 = 1$, $I_2 = 2$, $I_3 = 3$, $I_4 = 3$.

Hence, the four steps of our algorithm become

$r = G_1 = g$,
$r = r^2 \cdot G_2 = g^2 \cdot y$,
$r = r^2 \cdot G_3 = (g^4 \cdot y^2) \cdot (g \cdot y) = g^5 \cdot y^3$,
$r = r^2 \cdot G_3 = (g^{10} \cdot y^6) \cdot (g \cdot y) = g^{11} \cdot y^7$.

Note that elliptic curve analogues of Shamir’s trick and its variants exist, which make use of signed
representations for the exponent. We do not give these here, but leave them for the interested
reader to investigate.

6.4.  Multi-precision  Arithmetic 


We  shall  now  explain  how  to  perform  modular  arithmetic  on  2048-bit  numbers.  We  show  how  this 
is  accomplished  using  modern  processors,  and  then  go  on  to  show  why  naive  algorithms  are  usually 
replaced  with  a  special  technique  due  to  Montgomery. 

In  a  cryptographic  application  it  is  common  to  focus  on  a  fixed  length  for  the  integers  in  use,  for 
example  2048  bits  in  an  RSA/DSA  implementation  or  256  bits  for  an  ECC  implementation.  This 
leads  to  different  programming  choices  than  when  we  implement  a  general-purpose  multi-precision 
arithmetic  library.  For  example,  we  no  longer  need  to  worry  so  much  about  dynamic  memory 
allocation,  and  we  can  now  concentrate  on  particular  performance  enhancements  for  the  integer 
sizes  we  are  dealing  with. 

It is common to represent all integers in little-wordian format. This means that if a large integer
is held in memory locations $x_0, x_1, \ldots, x_n$, then $x_0$ is the least significant word and $x_n$ is the most
significant word. For a 64-bit machine and 128-bit numbers we would represent x and y as $[x_0, x_1]$
and $[y_0, y_1]$ where

$x = x_1 \cdot 2^{64} + x_0$,
$y = y_1 \cdot 2^{64} + y_0$.


6.4.1.  Addition:  Most  modern  processors  have  a  carry  flag  which  is  set  by  any  overflow  from  an 
addition  operation.  Also  most  have  a  special  instruction,  usually  called  something  like  addc,  which 
adds  two  integers  together  and  adds  on  the  contents  of  the  carry  flag.  So  if  we  wish  to  add  our  two 
128-bit  integers  given  earlier  then  we  need  to  compute 

$z = x + y = z_2 \cdot 2^{128} + z_1 \cdot 2^{64} + z_0$.

The values of $z_0$, $z_1$ and $z_2$ are then computed via

z0 <- add x0,y0
z1 <- addc x1,y1
z2 <- addc 0,0

Note that the value held in $z_2$ is at most one, so the value of z could be a 129-bit integer. The
above technique for adding two 128-bit integers can clearly be scaled to adding integers of any fixed
length, and can also be made to work for subtraction of large integers.
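
The carry chain can be imitated in Python as below. This is only a model of what the add and addc instructions do: Python integers are unbounded, so the 64-bit words and the carry flag are simulated with masks and shifts, and the names `add_words` and `to_int` are our own.

```python
WORD_BITS = 64
MASK = (1 << WORD_BITS) - 1


def add_words(x, y):
    """Add two little-wordian integers of equal length; returns one extra word."""
    z = []
    carry = 0
    for x_i, y_i in zip(x, y):
        s = x_i + y_i + carry            # what add / addc compute in hardware
        z.append(s & MASK)               # low 64 bits of the sum
        carry = s >> WORD_BITS           # the carry flag, 0 or 1
    z.append(carry)
    return z


def to_int(words):
    """Convert a little-wordian word list back to a Python integer."""
    return sum(w << (WORD_BITS * i) for i, w in enumerate(words))


x = [MASK, MASK]                         # 2^128 - 1
y = [1, 0]                               # 1
print(add_words(x, y))                   # [0, 0, 1], i.e. 2^128
```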
6.4.2. Schoolbook Multiplication: We now turn to the next simplest arithmetic operation,
after addition and subtraction, namely multiplication. Notice that two 64-bit words multiply
together to form a 128-bit result, and so most modern processors have an instruction which will
perform this operation.

$w_1 \cdot w_2 = (\mathrm{High}, \mathrm{Low}) = (H(w_1 \cdot w_2), L(w_1 \cdot w_2))$.

When we use schoolbook long multiplication, for our two 128-bit numbers, we obtain something
like

                                          x_1           x_0
    ×                                     y_1           y_0
    -------------------------------------------------------
                                     H(x_0·y_0)    L(x_0·y_0)
                       H(x_0·y_1)    L(x_0·y_1)
                       H(x_1·y_0)    L(x_1·y_0)
         H(x_1·y_1)    L(x_1·y_1)

Then we add up the four rows to get the answer, remembering we need to take care of the carries.
This then becomes, for

$z = x \cdot y$,

something like the following pseudo-code

(z1,z0) <- mul x0,y0
(z3,z2) <- mul x1,y1
(h,l) <- mul x1,y0
z1 <- add z1,l
z2 <- addc z2,h
z3 <- addc z3,0
(h,l) <- mul x0,y1
z1 <- add z1,l
z2 <- addc z2,h
z3 <- addc z3,0

If n denotes the bit size of the integers we are operating on, the above technique for multiplying
large integers together clearly requires $O(n^2)$ bit operations, whilst it requires $O(n)$ bit operations
to add or subtract integers. It is a natural question as to whether one can multiply integers faster
than $O(n^2)$.
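
For integers of any fixed number of words, the schoolbook method generalizes to a double loop. The following Python sketch models it on lists of 64-bit words; the names `mul_words` and `to_int` are our own, and the two-word products that a processor would return as (High, Low) are obtained here by masking and shifting.

```python
WORD_BITS = 64
MASK = (1 << WORD_BITS) - 1


def mul_words(x, y):
    """Schoolbook product of two little-wordian integers (lists of 64-bit words)."""
    z = [0] * (len(x) + len(y))
    for i, x_i in enumerate(x):
        carry = 0
        for j, y_j in enumerate(y):
            # x_i * y_j is a two-word (High, Low) result; adding the word
            # already in place and the running carry still fits in two words.
            t = z[i + j] + x_i * y_j + carry
            z[i + j] = t & MASK
            carry = t >> WORD_BITS
        z[i + len(y)] = carry
    return z


def to_int(words):
    """Convert a little-wordian word list back to a Python integer."""
    return sum(w << (WORD_BITS * i) for i, w in enumerate(words))


x = [MASK, MASK]
y = [MASK, MASK]
print(to_int(mul_words(x, y)) == to_int(x) * to_int(y))   # True
```

The two nested loops over the words of x and y are what give the quadratic cost.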

6.4.3.  Karatsuba  Multiplication:  One  technique  to  speed  up  multiplication  is  called  Karatsuba 
multiplication.  Suppose  we  have  two  n-bit  integers  x  and  y  that  we  wish  to  multiply.  We  write 
these  integers  as 

$x = x_0 + 2^{n/2} \cdot x_1$,
$y = y_0 + 2^{n/2} \cdot y_1$,

where $0 \le x_0, x_1, y_0, y_1 < 2^{n/2}$. We then multiply x and y by computing

$A \leftarrow x_0 \cdot y_0$,
$B \leftarrow (x_0 + x_1) \cdot (y_0 + y_1)$,
$C \leftarrow x_1 \cdot y_1$.

The product $x \cdot y$ is then given by

$C \cdot 2^n + (B - A - C) \cdot 2^{n/2} + A = x_1 \cdot y_1 \cdot 2^n + (x_1 \cdot y_0 + x_0 \cdot y_1) \cdot 2^{n/2} + x_0 \cdot y_0$
$= (x_0 + 2^{n/2} \cdot x_1) \cdot (y_0 + 2^{n/2} \cdot y_1)$
$= x \cdot y$.

Hence, to multiply two n-bit numbers we require three n/2-bit multiplications, two n/2-bit additions
and three n-bit additions/subtractions. If we denote the cost of an n-bit multiplication by M(n)
and the cost of an n-bit addition/subtraction by A(n), we have

$M(n) = 3 \cdot M(n/2) + 2 \cdot A(n/2) + 3 \cdot A(n)$.

Now if we make the approximation that $A(n) \approx n$ then

$M(n) \approx 3 \cdot M(n/2) + 4 \cdot n$.

If the multiplication of the n/2-bit numbers is accomplished in a similar fashion then to obtain the
final complexity of multiplication we solve the above recurrence relation to obtain

$M(n) \approx 9 \cdot n^{\log(3)/\log(2)}$ as $n \to \infty$
$\approx 9 \cdot n^{1.58}$.

So we obtain an algorithm with asymptotic complexity $O(n^{1.58})$. Karatsuba multiplication becomes
faster than the $O(n^2)$ method for integers of sizes greater than a few hundred bits. However, one
can do even better for very large integers, since methods based on the fast Fourier transform, such
as the Schönhage-Strassen algorithm, take time

$O(n \cdot \log n \cdot \log \log n)$.

But neither these latter techniques nor Karatsuba multiplication is used in many cryptographic
applications. The reason for this will become apparent as we discuss integer division.
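
The recursion can be sketched in Python as follows. The function name `karatsuba` and the cut-off `CUTOFF_BITS` are our own choices; below the cut-off the sketch simply falls back to Python's built-in multiplication, where a real implementation would use a word-level schoolbook routine.

```python
CUTOFF_BITS = 64     # below this size fall back to the built-in multiplication


def karatsuba(x, y, n):
    """Multiply non-negative integers x and y, each of at most n bits."""
    if n <= CUTOFF_BITS:
        return x * y
    half = n // 2
    mask = (1 << half) - 1
    x0, x1 = x & mask, x >> half         # x = x0 + 2^half * x1
    y0, y1 = y & mask, y >> half         # y = y0 + 2^half * y1
    A = karatsuba(x0, y0, half)
    C = karatsuba(x1, y1, n - half)
    B = karatsuba(x0 + x1, y0 + y1, n - half + 1)   # the sums may carry one bit
    return (C << (2 * half)) + ((B - A - C) << half) + A


x = (1 << 2048) - 12345
y = 3 ** 1200
print(karatsuba(x, y, 2048) == x * y)    # True
```

Each level replaces one n-bit product by three products of roughly n/2 bits, which is the source of the exponent log(3)/log(2).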

6.4.4.  Division:  After  having  looked  at  multiplication  we  are  left  with  the  division  operation, 
which  is  the  hardest  of  all  the  basic  algorithms.  After  all  division  is  required  in  order  to  be  able 
to  compute  the  remainder  on  division.  Given  two  large  integers  x  and  y  we  wish  to  be  able  to 
compute  q  and  r  such  that 

$x = q \cdot y + r$

where $0 \le r < y$; such an operation is called a Euclidean division. If we write our two integers x
and y in the little-wordian format

$x = (x_0, \ldots, x_n)$ and $y = (y_0, \ldots, y_t)$

where the base for the representation is $b = 2^w$ then the Euclidean division can be performed by
Algorithm 6.6. We let $u \ll_w v$ denote a large integer u shifted to the left by v words, in other words
the result of multiplying u by $b^v$. As one can see this is a complex operation, hence one should try
to avoid divisions as much as possible.

6.4.5.  Montgomery  Arithmetic:  That  division  is  a  complex  operation  means  our  cryptographic 
operations  run  very  slowly  if  we  use  standard  division  operations  as  above.  Virtually  all  of  the  public 
key  systems  we  will  consider  will  make  use  of  arithmetic  modulo  another  number.  What  we  require 
is  the  ability  to  compute  remainders  (i.e.  to  perform  modular  arithmetic)  without  having  to  perform 
any  costly  division  operations.  This  at  first  sight  may  seem  a  state  of  affairs  which  is  impossible  to 
reach,  but  it  can  be  achieved  using  a  special  form  of  arithmetic  called  Montgomery  arithmetic. 

Algorithm 6.6: Euclidean division algorithm
r ← x.
/* Cope with the trivial case */
if t > n then
    q ← 0.
    return.
q ← 0, s ← 0.
/* Normalize the divisor */
while y_t < b/2 do
    y ← 2 · y, r ← 2 · r, s ← s + 1.
if r_{n+1} ≠ 0 then n ← n + 1.
/* Get the most significant word of the quotient */
while r ≥ (y ≪_w (n - t)) do
    q_{n-t} ← q_{n-t} + 1.
    r ← r - (y ≪_w (n - t)).
/* Deal with the rest */
for i = n downto t + 1 do
    if r_i = y_t then q_{i-t-1} ← b - 1.
    else q_{i-t-1} ← ⌊(r_i · b + r_{i-1}) / y_t⌋.
    if t ≠ 0 then h_m ← y_t · b + y_{t-1}.
    else h_m ← y_t · b.
    h ← q_{i-t-1} · h_m.
    if i ≠ 1 then l ← r_i · b^2 + r_{i-1} · b + r_{i-2}.
    else l ← r_i · b^2 + r_{i-1} · b.
    while h > l do
        q_{i-t-1} ← q_{i-t-1} - 1.
        h ← h - h_m.
    r ← r - (q_{i-t-1} · y) ≪_w (i - t - 1).
    if r < 0 then
        r ← r + (y ≪_w (i - t - 1)).
        q_{i-t-1} ← q_{i-t-1} - 1.
/* Renormalize */
for i = 0 to s - 1 do r ← r/2.


Montgomery arithmetic works by using an alternative representation of integers, called the
Montgomery representation. Let us fix some notation; we let b denote 2 to the power of the word
size of our computer, for example $b = 2^{64}$. To perform arithmetic modulo N we choose an integer
R which satisfies

$R = b^t > N$,

for some integer value of t. Now instead of holding the value of the integer x in memory, we instead
hold the value

$x_R \leftarrow x \cdot R \pmod{N}$.

Again this is usually held in a little-wordian format. The value $x_R$ is called the Montgomery
representation of the integer x (mod N). Adding two elements in Montgomery representation is
easy; see Algorithm 6.7. If

$z = x + y \pmod{N}$

then given $x \cdot R \pmod{N}$ and $y \cdot R \pmod{N}$ we need to compute $z \cdot R \pmod{N}$.


Algorithm 6.7: Addition in Montgomery representation
z_R ← x_R + y_R.
if z_R ≥ N then z_R ← z_R - N.


Example:  Let  us  take  a  simple  example  with 

$N = 18\,443\,759\,776\,216\,676\,723$,
$b = R = 2^{64} = 18\,446\,744\,073\,709\,551\,616$.

The following is the map from the normal to Montgomery representation of the integers 1, 2 and 3.

$1 \mapsto 1 \cdot R \pmod{N} = 2\,984\,297\,492\,874\,893$,

$2 \mapsto 2 \cdot R \pmod{N} = 5\,968\,594\,985\,749\,786$,

$3 \mapsto 3 \cdot R \pmod{N} = 8\,952\,892\,478\,624\,679$.

We  can  now  verify  that  addition  works  since  we  have  in  the  standard  representation 

1  +  2  =  3 

whilst  this  is  mirrored  in  the  Montgomery  representation  as 

2  984  297  492  874  893  +  5  968  594  985  749  786  =  8  952  892  478  624  679  (mod  N) . 

Montgomery  Reduction:  Now  we  look  at  multiplication  in  Montgomery  arithmetic.  If  we  simply 
multiply  two  elements  in  Montgomery  representation  we  will  obtain 

$(x \cdot R) \cdot (y \cdot R) = x \cdot y \cdot R^2 \pmod{N}$

but we want $x \cdot y \cdot R \pmod{N}$. Hence, we need to divide the result of the standard multiplication
by R. Since R is a power of 2 we hope this should be easy. The process of computing

$z = y/R \pmod{N}$

given y and the earlier choice of R, is called Montgomery reduction. We first precompute the
integer $q = 1/N \pmod{R}$, which is simple to perform with no divisions using the binary Euclidean
algorithm. Then, performing a Montgomery reduction is done using Algorithm 6.8.


Algorithm 6.8: Montgomery reduction
u ← (-y · q) mod R.
z ← (y + u · N)/R.
if z ≥ N then z ← z - N.


Note  that  the  reduction  modulo  R  in  the  first  line  is  easy:  we  compute  y  •  q  using  standard 
algorithms,  the  reduction  modulo  R  being  achieved  by  truncating  the  result.  This  latter  trick  works 
since  R  is  a  power  of  b.  The  division  by  R  in  the  second  line  can  also  be  simply  achieved:  since 
$y + u \cdot N \equiv 0 \pmod{R}$, we simply shift the result to the right by t words, again since $R = b^t$.

Example: As an example we again take

$N = 18\,443\,759\,776\,216\,676\,723$,

$R = b = 2^{64} = 18\,446\,744\,073\,709\,551\,616$.

We wish to compute 2 · 3 in Montgomery representation. Recall

$2 \mapsto 2 \cdot R \pmod{N} = 5\,968\,594\,985\,749\,786 = x$,

$3 \mapsto 3 \cdot R \pmod{N} = 8\,952\,892\,478\,624\,679 = y$.

We then compute, using a standard multiplication algorithm, that

$w = x \cdot y = 53\,436\,189\,155\,876\,232\,216\,612\,898\,568\,694 \equiv 2 \cdot 3 \cdot R^2 \pmod{N}$.

We now need to pass this value of w into our technique for Montgomery reduction, so as to find
the Montgomery representation of $x \cdot y$. We find

$w = 53\,436\,189\,155\,876\,232\,216\,612\,898\,568\,694$,

$q = (1/N) \pmod{R} = 14\,241\,249\,658\,089\,591\,739$,

$u = -w \cdot q \pmod{R} = 17\,905\,784\,957\,249\,358$,

$z = (w + u \cdot N)/R = 17\,905\,784\,957\,249\,358$.

So the multiplication of x and y in Montgomery arithmetic should be

$17\,905\,784\,957\,249\,358$.

We can check that this is the correct value by computing

$6 \cdot R \pmod{N} = 17\,905\,784\,957\,249\,358$.

Hence,  we  see  that  Montgomery  arithmetic  allows  us  to  add  and  multiply  integers  modulo  an  integer 
N  without  the  need  for  costly  division  algorithms. 
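
The worked example can be replayed with a short Python sketch of Algorithm 6.8. The function names `to_mont` and `mont_reduce` are our own, and the precomputed q is obtained here with Python's built-in modular inverse (available from Python 3.8) rather than the binary Euclidean algorithm mentioned above. Note that `to_mont` itself uses an ordinary reduction modulo N; it is only there to set up the example.

```python
N = 18443759776216676723
R = 1 << 64
q = pow(N, -1, R)                        # q = 1/N mod R, needs Python 3.8+


def to_mont(x):
    """Map x to its Montgomery representation x * R mod N."""
    return (x * R) % N


def mont_reduce(y):
    """Return y / R mod N for 0 <= y < R * N, following Algorithm 6.8."""
    u = (-y * q) % R                     # reduction mod R is a truncation
    z = (y + u * N) // R                 # exact division: y + u*N is a multiple of R
    if z >= N:
        z -= N
    return z


print(to_mont(1))                                  # 2984297492874893
print(mont_reduce(to_mont(2) * to_mont(3)))        # 17905784957249358
print(to_mont(6))                                  # 17905784957249358
```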

Optimized  Montgomery  Reduction:  Our  above  method  for  Montgomery  reduction  requires 
two  full  multi-precision  multiplications.  So  to  multiply  two  numbers  in  Montgomery  arithmetic 
we  require  three  full  multi-precision  multiplications.  If  we  are  multiplying  2048-bit  numbers,  this 
means  the  intermediate  results  can  grow  to  be  4096-bit  numbers.  We  would  like  to  do  better,  and 
we  can. 

Suppose y is given in little-wordian format

$y = (y_0, y_1, \ldots, y_{2t-2}, y_{2t-1})$.

Then a better way to perform Montgomery reduction is to first precompute $N' = -1/N \pmod{b}$,
which is easy and only requires operations on word-sized quantities, and then to execute Algorithm
6.9.


Algorithm 6.9: Word-oriented Montgomery reduction
z ← y.
for i = 0 to t - 1 do
    u ← (z_i · N') mod b.
    z ← z + u · N · b^i.
z ← z/R.
if z ≥ N then z ← z - N.


Note that since we are reducing modulo b in the first line of the for loop we can execute this
initial multiplication using a simple word multiplication algorithm. The second step of the for loop
requires a shift by one word (to multiply by b) and a single word × bigint multiply. Hence, we have
reduced the need for large intermediate results in the Montgomery reduction step.
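
A Python sketch of the word-oriented reduction is given below. The function name `mont_reduce_words` and its parameters are our own; words are extracted from a Python integer by shifting and masking, which only models what a real implementation would do on an array of machine words.

```python
def mont_reduce_words(y, N, t, word_bits=64):
    """Return y / R mod N, where R = b^t and b = 2^word_bits.

    Assumes N is odd, N < R and 0 <= y < R * N.
    """
    b = 1 << word_bits
    n_prime = (-pow(N, -1, b)) % b       # N' = -1/N mod b, a single word
    z = y
    for i in range(t):
        z_i = (z >> (word_bits * i)) & (b - 1)      # i-th word of z
        u = (z_i * n_prime) % b                     # word-sized multiplication
        z += (u * N) << (word_bits * i)             # clears the i-th word of z
    z >>= word_bits * t                  # division by R is a shift by t words
    if z >= N:
        z -= N
    return z


N = 18443759776216676723
x, y = 5968594985749786, 8952892478624679          # 2 and 3 in Montgomery form
print(mont_reduce_words(x * y, N, 1))               # 17905784957249358
```

Each pass through the loop only needs the single word N' and one word-by-bigint product, in contrast to the full-size multiplication by q in Algorithm 6.8.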

Montgomery  Multiplication:  We  can  also  interleave  the  multiplication  with  the  reduction  to 
perform  a  single  loop  to  produce 


$Z = X \cdot Y / R \pmod{N}$.

So if $X = x \cdot R$ and $Y = y \cdot R$ this will produce

$Z = (x \cdot y) \cdot R$.

This procedure is called Montgomery multiplication and allows us to perform a multiplication
in Montgomery arithmetic without the need for larger integers, as in Algorithm 6.10. Whilst
Montgomery multiplication has complexity $O(n^2)$ as opposed to the $O(n^{1.58})$ of Karatsuba
multiplication, it is still preferable to use Montgomery arithmetic since it deals more efficiently with
modular arithmetic.


Algorithm 6.10: Montgomery multiplication
Z ← 0.
for i = 0 to t - 1 do
    u ← ((z_0 + X_i · Y_0) · N') mod b.
    Z ← (Z + X_i · Y + u · N)/b.
if Z ≥ N then Z ← Z - N.
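
The interleaved multiplication can be sketched in Python as follows. The function name `mont_mul` and its parameters are our own; as before, words are pulled out of Python integers by shifting and masking, and N' is computed with the built-in modular inverse (Python 3.8 or later).

```python
def mont_mul(X, Y, N, t, word_bits=64):
    """Return X * Y / R mod N, where R = b^t and b = 2^word_bits.

    Assumes N is odd and 0 <= X, Y < N < R.
    """
    b = 1 << word_bits
    n_prime = (-pow(N, -1, b)) % b       # N' = -1/N mod b
    y_0 = Y & (b - 1)                    # least significant word of Y
    Z = 0
    for i in range(t):
        x_i = (X >> (word_bits * i)) & (b - 1)      # i-th word of X
        z_0 = Z & (b - 1)                           # least significant word of Z
        u = ((z_0 + x_i * y_0) * n_prime) % b
        Z = (Z + x_i * Y + u * N) >> word_bits      # exact division by b
    if Z >= N:
        Z -= N
    return Z


N = 18443759776216676723
x, y = 5968594985749786, 8952892478624679          # 2 and 3 in Montgomery form
print(mont_mul(x, y, N, 1))                         # 17905784957249358
```

The running value Z never exceeds about one word more than N, so no double-length intermediate product is ever formed.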


6.5.  Finite  Field  Arithmetic 

Apart from the integers modulo a large prime p the other finite fields used in cryptography are
those of characteristic two. These occur in the AES algorithm and in certain elliptic curve systems.
In AES the field is so small that one can use look-up tables or special circuits to perform the basic
arithmetic tasks, so in this section we shall concentrate on fields of large degree over $\mathbb{F}_2$, such as
those used for elliptic curves. In addition we shall concern ourselves with software implementations
only. Fields of characteristic two can have special types of hardware implementations based on
optimal normal bases, but we shall not concern ourselves with these.

Recall that to define a finite field of characteristic two we first pick an irreducible polynomial
f(x) over $\mathbb{F}_2$ of degree n. The field is defined to be

$\mathbb{F}_{2^n} = \mathbb{F}_2[x]/f(x)$,

i.e. we look at binary polynomials modulo f(x). Elements of this field are usually represented as
bit strings, which represent a binary polynomial. For example the bit string

101010111

represents the polynomial

$x^8 + x^6 + x^4 + x^2 + x + 1$.

Addition and subtraction of elements in $\mathbb{F}_{2^n}$ is accomplished by simply performing a bit-wise
exclusive-or, written ⊕, between the two bitstrings. Hence, the difficult tasks are multiplication
and division.

6.5.1. Characteristic-Two Field Division: It turns out that division, although slower than
multiplication, is easier to describe, so we start with division. To compute $\alpha/\beta$, where $\alpha, \beta \in \mathbb{F}_{2^n}$,
we first compute $\beta^{-1}$ and then perform the multiplication $\alpha \cdot \beta^{-1}$. So division is reduced to
multiplication and the computation of $\beta^{-1}$. One way of computing $\beta^{-1}$ is to use Lagrange’s
Theorem which tells us, for $\beta \ne 0$, that we have

$\beta^{2^n - 1} = 1$.

But this means that

$\beta \cdot \beta^{2^n - 2} = 1$,

or in other words

$\beta^{-1} = \beta^{2^n - 2}$.

Another way of computing $\beta^{-1}$ is to use the binary Euclidean algorithm. We take the polynomial
a = f and the polynomial b which represents $\beta$ and then perform Algorithm 6.11, which is a version
of the binary Euclidean algorithm, where lsb(b) refers to the least significant bit of b (in other words
the coefficient of $x^0$).


Algorithm  6.11:  Inversion  of  b(x)  modulo  f(x) 

B  0,  D  <-  1. 

/*  At  least  one  of  a  and  b  will  have  a  constant  term  on  every  execution  of  the  loop  */ 

while  a  /  0  do 

while  lsb(a)  =  0  do 

Qj  i —  Ci  1 . 

if  lsb(E>)  /  0  then  5u5©/. 

B  <-  B  >  1. 

while  lsb(fr)  =  0  do 

b  <—  b  1. 

if  lsb(.D)  7^  0  then  DuD©/. 

D  <-  D  >  1. 

/*  Now  both  a  and  b  have  a  constant  term  */ 
if  deg(a)  >  d eg(b)  then 
Ci  i —  Ci  ©  b. 

B  <-  B  ©  D. 
else 

b  a  0  b. 

D  <-  D®B. 

return  D. 
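A direct Python transcription of the inversion algorithm might look as follows. It is a sketch under stated assumptions: polynomials are integers with bit i holding the coefficient of x^i, b is nonzero and already reduced modulo f, and f is irreducible (so it has a constant term). The function name gf2n_inverse is ours, and the degree comparison is done with int.bit_length.

```python
def gf2n_inverse(b: int, f: int) -> int:
    """Invert b modulo the irreducible binary polynomial f.

    Invariants maintained by the loop (beta is the original b):
        B * beta = a (mod f)   and   D * beta = b (mod f).
    """
    if b == 0:
        raise ZeroDivisionError("zero has no inverse")
    a, B, D = f, 0, 1
    while a != 0:
        while a & 1 == 0:           # a has no constant term: divide by x
            a >>= 1
            if B & 1:
                B ^= f              # make B divisible by x modulo f
            B >>= 1
        while b & 1 == 0:           # same treatment for b and D
            b >>= 1
            if D & 1:
                D ^= f
            D >>= 1
        # Now both a and b have a constant term.
        if a.bit_length() >= b.bit_length():
            a ^= b
            B ^= D
        else:
            b ^= a
            D ^= B
    return D


# Assumed example field: x^8 + x^4 + x^3 + x + 1.
print(hex(gf2n_inverse(0x53, 0x11B)))
```

Only shifts and XORs are used, which is why this method is attractive even though it is slower than a multiplication.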


6.5.2. Characteristic-Two Field Multiplication: We now turn to the multiplication operation.
Unlike the case of integers modulo N or p, where we use a special method of Montgomery
arithmetic, in characteristic two we have the opportunity to choose a polynomial $f(x)$ which has
"nice" properties. Any irreducible polynomial of degree n can be used to implement the finite field
$\mathbb{F}_{2^n}$; we just need to select the best one.

Almost always one chooses a value of $f(x)$ which is either a trinomial

$$f(x) = x^n + x^k + 1$$

or a pentanomial

$$f(x) = x^n + x^{k_3} + x^{k_2} + x^{k_1} + 1.$$

It turns out that for all fields of degree less than 10 000 we can always find such a trinomial or
pentanomial to make the multiplication operation very efficient. Table 6.1 at the end of this chapter
gives a list for all values of n between 2 and 500 of an example pentanomial or trinomial which
defines the field $\mathbb{F}_{2^n}$. In all cases where a trinomial exists we give one, otherwise we present a
pentanomial.

Now to perform a multiplication of $\alpha$ by $\beta$ we first multiply the polynomials representing $\alpha$ and
$\beta$ together to form a polynomial $\gamma(x)$ of degree at most $2 \cdot n - 2$. Then we reduce this polynomial
by taking the remainder on division by the polynomial $f(x)$.

We show how this remainder on division is efficiently performed for trinomials, and leave the
pentanomial case for the reader. We write

$$\gamma(x) = \gamma_1(x) \cdot x^n + \gamma_0(x).$$

Hence, $\deg(\gamma_1(x)), \deg(\gamma_0(x)) \le n - 1$. We can then write, as $x^n = x^k + 1 \pmod{x^n + x^k + 1}$,

$$\gamma(x) \pmod{f(x)} = \gamma_0(x) + (x^k + 1) \cdot \gamma_1(x).$$

The right-hand side of this equation can be computed from the bit operations

$$\delta = \gamma_0 \oplus \gamma_1 \oplus (\gamma_1 \ll k).$$

Now $\delta$, as a polynomial, will have degree at most $n - 1 + k$. So we need to carry out this procedure
again by first writing

$$\delta(x) = \delta_1(x) \cdot x^n + \delta_0(x),$$

where $\deg(\delta_0(x)) \le n - 1$ and $\deg(\delta_1(x)) \le k - 1$. We then compute as before that $\gamma$ is equivalent
to

$$\delta_0 \oplus \delta_1 \oplus (\delta_1 \ll k).$$

This latter polynomial will have degree at most $\max(n - 1, 2k - 1)$, so if we select our trinomial so that

$$k \le n/2,$$

then Algorithm 6.12 will perform our division-with-remainder step. Let g denote the polynomial
of degree $2 \cdot n - 2$ that we wish to reduce modulo f, where we assume a bit representation for these
polynomials.


Algorithm 6.12: Reduction of g by a trinomial

g1 ← g ≫ n.
g0 ← g[n − 1 ... 0].
g ← g0 ⊕ g1 ⊕ (g1 ≪ k).
g1 ← g ≫ n.
g0 ← g[n − 1 ... 0].
g ← g0 ⊕ g1 ⊕ (g1 ≪ k).
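The two folding passes of the trinomial reduction can be sketched in Python as below. The function names reduce_trinomial and poly_mod_naive are ours; the second is a plain long-division remainder included only as a cross-check. The sketch assumes integers as bit strings, an input of degree at most 2n − 2, and k ≤ n/2 as required in the text.

```python
def reduce_trinomial(g: int, n: int, k: int) -> int:
    """Reduce g (degree <= 2n - 2) modulo x^n + x^k + 1, assuming k <= n/2."""
    mask = (1 << n) - 1
    for _ in range(2):               # two passes suffice when k <= n/2
        g1 = g >> n                  # the part multiplied by x^n
        g0 = g & mask                # the low n bits, g[n-1 ... 0]
        g = g0 ^ g1 ^ (g1 << k)      # replace x^n by x^k + 1
    return g


def poly_mod_naive(g: int, f: int) -> int:
    """Reference remainder by long division, used here only for checking."""
    n = f.bit_length() - 1
    while g.bit_length() - 1 >= n:
        g ^= f << (g.bit_length() - 1 - n)
    return g


# Example with n = 7, k = 1, i.e. f(x) = x^7 + x + 1.
n, k = 7, 1
f = (1 << n) | (1 << k) | 1
g = 1 << 12                          # the monomial x^12
print(bin(reduce_trinomial(g, n, k)), bin(poly_mod_naive(g, f)))
```

If k were larger than n/2 the second pass could still leave terms of degree n or more, so further passes would be needed.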


So to complete our description of how to multiply elements in $\mathbb{F}_{2^n}$ we need to explain how to
perform the multiplication of two binary polynomials of large degree $n - 1$. Again one can use
a naive multiplication algorithm. Often, however, one uses a look-up table for multiplication of
polynomials of degree less than eight, i.e. for operands which fit into one byte. Then multiplication
of larger-degree polynomials is reduced to multiplication of polynomials of degree less than eight
by using a variant of the standard long multiplication algorithm from school. This algorithm will
have complexity $O(n^2)$, where n is the degree of the polynomials involved.

Suppose we have a routine which uses a look-up table to multiply two binary polynomials of
degree less than eight, returning a binary polynomial of degree less than sixteen. This function we
denote by MultTab(a, b) where a and b are 8-bit integers representing the input polynomials. To
perform a multiplication of two n-bit polynomials represented by two n-bit integers x and y we
perform Algorithm 6.13, where y ≫ 8 (resp. y ≪ 8) represents shifting to the right (resp.
left) by 8 bits.


Algorithm 6.13: Multiplication of two n-bit polynomials x and y over $\mathbb{F}_2$

i ← 0, a ← 0.
while x ≠ 0 do
    u ← y, j ← 0.
    while u ≠ 0 do
        w ← MultTab(x & 255, u & 255).
        w ← w ≪ (8 · (i + j)).
        a ← a ⊕ w.
        u ← u ≫ 8.
        j ← j + 1.
    x ← x ≫ 8.
    i ← i + 1.
return a.
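The byte-wise multiplication algorithm can be sketched in Python as follows. The table MULT_TAB plays the role of MultTab; how it is built (here by a simple bit loop in the helper _mul8) and the function name poly_mul are our own assumptions, since the text does not specify how the table is generated.

```python
def _mul8(a: int, b: int) -> int:
    """Carry-less product of two 8-bit polynomials (fits in 15 bits)."""
    r = 0
    for i in range(8):
        if (b >> i) & 1:
            r ^= a << i
    return r


# Assumed realisation of MultTab: a 256 x 256 table built once at start-up.
MULT_TAB = [[_mul8(a, b) for b in range(256)] for a in range(256)]


def poly_mul(x: int, y: int) -> int:
    """Multiply two binary polynomials byte by byte, as in Algorithm 6.13."""
    acc = 0
    i = 0
    while x != 0:
        u, j = y, 0
        while u != 0:
            w = MULT_TAB[x & 255][u & 255]   # product of two byte-sized pieces
            acc ^= w << (8 * (i + j))        # place it at the right offset
            u >>= 8
            j += 1
        x >>= 8
        i += 1
    return acc


print(hex(poly_mul(0x53, 0xCA)))   # unreduced product, still to be taken mod f
```

The result is the unreduced product of degree up to 2n − 2; a reduction step modulo f(x) must follow to obtain a field element.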


6.5.3. Karatsuba Multiplication: Just as with integer multiplication one can use a
divide-and-conquer technique based on Karatsuba multiplication, which again will have a complexity of
$O(n^{1.58})$. Suppose the two polynomials we wish to multiply are given by

$$a = a_0 + a_1 \cdot x^{n/2},$$
$$b = b_0 + b_1 \cdot x^{n/2},$$

where $a_0, a_1, b_0, b_1$ are polynomials of degree less than $n/2$. We then multiply a and b by computing

A ← $a_0 \cdot b_0$,
B ← $(a_0 + a_1) \cdot (b_0 + b_1)$,
C ← $a_1 \cdot b_1$.

The product $a \cdot b$ is then given by

$$\begin{aligned}
C \cdot x^n + (B - A - C) \cdot x^{n/2} + A &= a_1 \cdot b_1 \cdot x^n + (a_1 \cdot b_0 + a_0 \cdot b_1) \cdot x^{n/2} + a_0 \cdot b_0 \\
&= (a_0 + a_1 \cdot x^{n/2}) \cdot (b_0 + b_1 \cdot x^{n/2}) \\
&= a \cdot b.
\end{aligned}$$

Again to multiply $a_0$ and $b_0$ etc. we use the Karatsuba multiplication method recursively. Once
we reduce to the case of multiplying two polynomials of degree less than eight we resort to using
our look-up table to perform the polynomial multiplication. Unlike the integer case we now find
that Karatsuba multiplication is more efficient than the schoolbook method even for polynomials
of quite small degree, say $n \approx 32$.
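A recursive Python sketch of this Karatsuba scheme for binary polynomials is given below. It is illustrative only: the names karatsuba and clmul_school are ours, the base case uses a simple shift-and-XOR multiplication in place of the look-up table, and the parameter n is assumed to be an upper bound on the bit length of both operands. In characteristic two the subtraction B − A − C is simply an XOR.

```python
def clmul_school(a: int, b: int) -> int:
    """Schoolbook carry-less multiplication, used as the base case."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def karatsuba(a: int, b: int, n: int) -> int:
    """Multiply binary polynomials a and b, each of fewer than n bits."""
    if n <= 8:
        return clmul_school(a, b)
    h = n // 2
    mask = (1 << h) - 1
    a0, a1 = a & mask, a >> h        # a = a0 + a1 * x^h
    b0, b1 = b & mask, b >> h        # b = b0 + b1 * x^h
    A = karatsuba(a0, b0, h)
    C = karatsuba(a1, b1, h)
    B = karatsuba(a0 ^ a1, b0 ^ b1, h)
    # C * x^(2h) + (B - A - C) * x^h + A, with "-" equal to XOR here.
    return (C << (2 * h)) ^ ((A ^ B ^ C) << h) ^ A


x = 0xDEADBEEFCAFEBABE
y = 0x0123456789ABCDEF
print(karatsuba(x, y, 64) == clmul_school(x, y))
```

Three half-size multiplications replace the four that the schoolbook method would need, which is the source of the improved exponent.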

6.5.4. Squaring in Characteristic Two: One should note that squaring polynomials in fields
of characteristic two is particularly easy. Suppose we have a polynomial

$$a = a_0 + a_1 \cdot x + a_2 \cdot x^2 + a_3 \cdot x^3,$$

where $a_i = 0$ or 1. Then to square a we simply "thin out" the coefficients, as 2 = 0 (mod 2), as
follows:

$$a^2 = a_0 + a_1 \cdot x^2 + a_2 \cdot x^4 + a_3 \cdot x^6.$$

This means that squaring an element in a finite field of characteristic two is very fast compared
with a multiplication operation.
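The "thinning out" of coefficients amounts to inserting a zero bit between consecutive bits of the operand. A minimal Python sketch, with the function name poly_square chosen by us, is:

```python
def poly_square(a: int) -> int:
    """Square a binary polynomial by moving bit i to bit 2i."""
    result = 0
    i = 0
    while a:
        if a & 1:
            result |= 1 << (2 * i)   # coefficient of x^i becomes that of x^(2i)
        a >>= 1
        i += 1
    return result


a = 0b1111                           # 1 + x + x^2 + x^3
print(bin(poly_square(a)))           # coefficients spread to even positions
```

The result is unreduced, so a field squaring still needs the reduction modulo f(x) afterwards; real implementations typically spread the bits with a small per-byte table rather than a bit loop.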


Chapter  Summary 

•  Modular  exponentiation,  or  exponentiation  in  any  group,  can  be  computed  using  the 
binary  exponentiation  method.  Often  it  is  more  efficient  to  use  a  window  based  method, 
or  to  use  a  signed-exponentiation  method  in  the  case  of  elliptic  curves. 

•  There  are  some  special  optimizations  in  various  cases.  In  the  case  of  a  known  exponent 
we  hope  to  choose  one  which  is  both  small  and  has  very  low  Hamming  weight.  For 
exponentiation  by  a  private  exponent  we  use  knowledge  of  the  prime  factorization  of  the 
modulus  and  the  Chinese  Remainder  Theorem. 

•  Simultaneous exponentiation is often more efficient than performing two single
exponentiations and then combining the result.

•  Modular arithmetic is usually implemented using the technique of Montgomery
representation. This allows us to avoid costly division operations by replacing the division with
simple shift operations. This however is at the expense of using a non-standard
representation for the numbers.

•  Finite  fields  of  characteristic  two  can  also  be  implemented  efficiently,  but  now  the  modular 
reduction  operation  can  be  made  simple  by  choosing  a  special  polynomial  f(x).  Inversion 
is  also  particularly  simple  using  a  variant  of  the  binary  Euclidean  algorithm,  although 
often  inversion  is  still  three  to  ten  times  slower  than  multiplication. 


Further  Reading 

The  standard  reference  work  for  the  type  of  algorithms  considered  in  this  chapter  is  Volume  2 
of  Knuth.  A  more  gentle  introduction  can  be  found  in  the  book  by  Bach  and  Shallit,  whilst  for 
more  algorithms  one  should  consult  the  book  by  Cohen.  The  first  chapter  of  Cohen  gives  a  number 
of  lessons  learnt  in  the  development  of  the  PARI/GP  calculator  which  can  be  useful,  whilst  Bach 
and  Shallit  provides  an  extensive  bibliography  and  associated  commentary. 

E. Bach and J. Shallit. Algorithmic Number Theory, Volume 1: Efficient Algorithms. MIT Press,
1996.

H. Cohen. A Course in Computational Algebraic Number Theory. Springer, 1993.

D. Knuth. The Art of Computer Programming, Volume 2: Seminumerical Algorithms.
Addison-Wesley, 1975.
Table 6.1. Trinomials and pentanomials

  n  k/k1,k2,k3     n  k/k1,k2,k3     n  k/k1,k2,k3
  2  1              3  1              4  1
  5  2              6  1              7  1
  8  7,3,2          9  1             10  3
 11  2             12  3             13  4,3,1
 14  5             15  1             16  5,3,1
 17  3             18  3             19  5,2,1
 20  3             21  2             22  1
 23  5             24  8,3,2         25  3
 26  4,3,1         27  5,2,1         28  1
 29  2             30  1             31  3
 32  7,3,2         33  10            34  7
 35  2             36  9             37  6,4,1
 38  6,5,1         39  4             40  5,4,3
 41  3             42  7             43  6,4,3
 44  5             45  4,3,1         46  1
 47  5             48  11,5,1        49  9
 50  4,3,2         51  6,3,1         52  3
 53  6,2,1         54  9             55  7
 56  7,4,2         57  4             58  19
 59  7,4,2         60  1             61  5,2,1
 62  29            63  1             64  11,2,1
 65  32            66  3             67  5,2,1
 68  33            69  6,5,2         70  37,34,33
 71  35            72  36,35,33      73  42
 74  35            75  35,34,32      76  38,33,32
 77  38,33,32      78  41,37,32      79  40,36,32
 80  45,39,32      81  35            82  43,35,32
 83  39,33,32      84  35            85  35,34,32
 86  49,39,32      87  46,34,32      88  45,35,32
 89  38            90  35,34,32      91  41,33,32
 92  37,33,32      93  35,34,32      94  43,33,32
 95  41,33,32      96  57,38,32      97  33
 98  63,35,32      99  42,33,32     100  37
101  40,34,32     102  37           103  72
104  43,33,32     105  37           106  73,33,32
107  54,33,32     108  33           109  34,33,32
110  33           111  49           112  73,51,32
113  37,33,32     114  69,33,32     115  53,33,32
116  48,33,32     117  78,33,32     118  33
119  38           120  41,35,32     121  35,34,32
122  39,34,32     123  42,33,32     124  37
125  79,33,32     126  49           127  63
128  55,33,32     129  46           130  61,33,32
131  43,33,32     132  44,33,32     133  46,33,32
134  57           135  39,33,32     136  35,33,32
137  35           138  57,33,32     139  38,33,32
140  45           141  85,35,32     142  71,33,32
143  36,33,32     144  59,33,32     145  52
146  71           147  49           148  61,33,32
149  64,34,32     150  53           151  39
152  35,33,32     153  71,33,32     154  109,33,32
155  62           156  57           157  47,33,32
158  76,33,32     159  34           160  79,33,32
161  39           162  63           163  48,34,32
164  42,33,32     165  35,33,32     166  37
167  35           168  134,33,32    169  34
170  105,35,32    171  125,34,32    172  81
173  71,33,32     174  57           175  57
176  79,37,32     177  88           178  87
179  80,33,32     180  33           181  46,33,32
182  81           183  56           184  121,39,32
185  41           186  79           187  37,33,32
188  46,33,32     189  37,34,32     190  47,33,32
191  51           192  147,33,32    193  73
194  87           195  50,34,32     196  33
197  38,33,32     198  65           199  34
200  57,35,32     201  59           202  55
203  68,33,32     204  99           205  94,33,32
206  37,33,32     207  43           208  119,34,32
209  45           210  49,35,32     211  175,33,32
212  105          213  75,33,32     214  73
215  51           216  115,34,32    217  45
218  71           219  54,33,32     220  33
221  63,33,32     222  102,33,32    223  33
224  39,33,32     225  32           226  59,34,32
227  81,33,32     228  113          229  64,35,32
230  50,33,32     231  34           232  191,33,32
233  74           234  103          235  34,33,32
236  50,33,32     237  80,34,32     238  73
239  36           240  177,35,32    241  70
242  95           243  143,34,32    244  111
245  87,33,32     246  62,33,32     247  82
248  155,33,32    249  35           250  103
251  130,33,32    252  33           253  46
254  85,33,32     255  52           256  91,33,32
257  41           258  71           259  113,33,32
260  35           261  89,34,32     262  86,33,32
263  93           264  179,33,32    265  42
266  47           267  42,33,32     268  61
269  207,33,32    270  53           271  58
272  165,35,32    273  53           274  67
275  81,33,32     276  63           277  91,33,32
278  70,33,32     279  38           280  242,33,32
281  93           282  35           283  53,33,32
284  53           285  50,33,32     286  69
287  71           288  111,33,32    289  36
290  81,33,32     291  168,33,32    292  37
293  94,33,32     294  33           295  48
296  87,33,32     297  83           298  61,33,32
299  147,33,32    300  45           301  83,33,32
302  41           303  36,33,32     304  203,33,32
305  102          306  66,33,32     307  46,33,32
308  40,33,32     309  107,33,32    310  93
311  78,33,32     312  87,33,32     313  79
314  79,33,32     315  132,33,32    316  63
317  36,34,32     318  45           319  36
320  135,34,32    321  41           322  67
323  56,33,32     324  51           325  46,33,32
326  65,33,32     327  34           328  195,37,32
329  50           330  99           331  172,33,32
332  89           333  43,34,32     334  43,33,32
335  113,33,32    336  267,33,32    337  55
338  86,35,32     339  72,33,32     340  45
341  126,33,32    342  125          343  75
344  135,34,32    345  37           346  63
347  56,33,32     348  103          349  182,34,32
350  53           351  34           352  147,34,32
353  69           354  99           355  43,33,32
356  112,33,32    357  76,34,32     358  57
359  68           360  323,33,32    361  56,33,32
362  63           363  74,33,32     364  67
365  303,33,32    366  38,33,32     367  171
368  283,34,32    369  91           370  139
371  116,33,32    372  111          373  299,33,32
374  42,33,32     375  64           376  227,33,32
377  41           378  43           379  44,33,32
380  47           381  107,34,32    382  81
383  90           384  295,34,32    385  51
386  83           387  162,33,32    388  159
389  275,33,32    390  49           391  37,33,32
392  71,33,32     393  62           394  135
395  301,33,32    396  51           397  161,34,32
398  122,33,32    399  49           400  191,33,32
401  152          402  171          403  79,33,32
404  65           405  182,33,32    406  141
407  71           408  267,33,32    409  87
410  87,33,32     411  122,33,32    412  147
413  199,33,32    414  53           415  102
416  287,38,32    417  107          418  199
419  200,33,32    420  45           421  191,33,32
422  149          423  104,33,32    424  213,34,32
425  42           426  63           427  62,33,32
428  105          429  83,33,32     430  62,33,32
431  120          432  287,34,32    433  33
434  55,33,32     435  236,33,32    436  165
437  40,34,32     438  65           439  49
440  63,33,32     441  35           442  119,33,32
443  221,33,32    444  81           445  146,33,32
446  105          447  73           448  83,33,32
449  134          450  47           451  406,33,32
452  97,33,32     453  87,33,32     454  128,33,32
455  38           456  67,34,32     457  61
458  203          459  68,33,32     460  61
461  194,35,32    462  73           463  93
464  143,33,32    465  59           466  143,33,32
467  156,33,32    468  33           469  116,34,32
470  149          471  119          472  47,33,32
473  200          474  191          475  134,33,32
476  129          477  150,33,32    478  121
479  104          480  169,35,32    481  138
482  48,35,32     483  288,33,32    484  105
485  267,33,32    486  81           487  94
488  79,33,32     489  83           490  219
491  61,33,32     492  50,33,32     493  266,33,32
494  137          495  76           496  43,33,32
497  78           498  155          499  40,33,32
500  75

Part  2 

Historical  Ciphers 


In  this  part  we  discuss  some  historical  ciphers;  those  who  are  interested  in  pressing  on  with 
modern  cryptography  should  jump  straight  to  Part  3.  However,  discussing  the  construction  of 
historical  ciphers  and  how  they  were  broken  enables  one  to  get  a  view  of  how  modern  cryptosystems 
came  to  be  designed  as  they  are.  For  example,  modern  block  ciphers  are  built  out  of  two  key 
primitives,  substitution  and  permutation,  both  of  which  occur  in  the  construction  of  historical 
ciphers. 

Encryption  of  most  data  today  is  accomplished  using  fast  block  and  stream  ciphers.  These  are 
examples  of  symmetric  encryption  algorithms.  In  addition  all  historical,  i.e.  pre-1960,  ciphers  are 
symmetric  in  nature  and  share  some  design  principles  with  modern  ciphers.  The  main  drawback 
of  symmetric  ciphers  is  that  they  give  rise  to  the  problem  of  how  to  distribute  the  secret  keys,  a 
problem  which  resulted  in  the  Allied  breaks  of  Enigma  and  Lorenz  during  World  War  II,  which  we 
discuss  in  this  part. 


CHAPTER  7 


Historical  Ciphers 


Chapter  Goals 

•  To  explain  a  number  of  historical  ciphers,  such  as  the  Caesar  cipher  and  the  substitution 
cipher. 

•  To  show  how  these  historical  ciphers  can  be  broken  because  they  do  not  hide  the  underlying 
statistics  of  the  plaintext. 

•  To  introduce  the  concepts  of  substitution  and  permutation  as  basic  cipher  components. 

•  To  introduce  a  number  of  attack  techniques,  such  as  chosen  plaintext  attacks. 

7.1.  Introduction 

An  encryption  algorithm,  or  cipher,  is  a  means  of  transforming  plaintext  into  ciphertext  under  the 
control  of  a  secret  key.  This  process  is  called  encryption  or  encipherment.  We  write 

$$c = e_k(m),$$

where

•  m is the plaintext,

•  e is the cipher function,

•  k is the secret key,

•  c is the ciphertext.

The reverse process is called decryption or decipherment, and we write

$$m = d_k(c).$$

Note  that  the  encryption  and  decryption  algorithms  e,  d  are  public:  the  secrecy  of  m  given  c 
depends  totally  on  the  secrecy  of  k. 

The  above  process  requires  that  each  party  needs  access  to  the  secret  key.  The  key  needs  to  be 
known  to  both  sides,  but  needs  to  be  kept  secret.  Encryption  algorithms  which  have  this  property 
are  called  symmetric  cryptosystems  or  secret  key  cryptosystems.  There  is  a  form  of  cryptography 
which  uses  two  different  types  of  key;  one  is  publicly  available  and  used  for  encryption  whilst  the 
other  is  private  and  used  for  decryption.  These  latter  types  of  cryptosystems  are  called  asymmetric 
cryptosystems or public key cryptosystems, and we shall return to them in a later chapter.

Usually  in  cryptography  the  communicating  parties  are  denoted  by  A  and  B.  However,  often 
one  uses  the  more  user-friendly  names  of  Alice  and  Bob.  But  you  should  not  assume  that  the 
parties  are  necessarily  human;  we  could  be  describing  a  communication  being  carried  out  between 
two  autonomous  machines.  The  eavesdropper,  bad  girl,  adversary  or  attacker  is  usually  given  the 
name  Eve. 

In this chapter we shall present some historical ciphers which were used in the pre-computer
age to encrypt data. We shall show that these ciphers are easy to break as soon as one understands
the statistics of the underlying language, in our case English. In Chapter 9 we shall study this
relationship between how easy the cipher is to break and the statistical distribution of the underlying
plaintext in more detail.


Letter  Freq. (%)    Letter  Freq. (%)
A        8.2         N        6.7
B        1.5         O        7.5
C        2.8         P        1.9
D        4.2         Q        0.1
E       12.7         R        6.0
F        2.2         S        6.3
G        2.0         T        9.0
H        6.1         U        2.8
I        7.0         V        1.0
J        0.1         W        2.4
K        0.8         X        0.1
L        4.0         Y        2.0
M        2.4         Z        0.1

Table 7.1. English letter frequencies


Figure  7.1.  English  letter  frequencies 


The distribution of English letter frequencies is described in Table 7.1, or graphically in
Figure 7.1. As one can see, the most common letters are E and T. It often helps to know second-order
statistics  about  the  underlying  language,  such  as  which  are  the  most  common  sequences  of  two 
or  three  letters,  called  bigrams  and  trigrams.  The  most  common  bigrams  in  English  are  given 
by  Table  7.2,  with  the  associated  approximate  frequencies.  The  most  common  trigrams  are,  in 
decreasing  order, 

THE,  ING,  AND,  HER,  ERE,  ENT,  THA,  NTH,  WAS,  ETH,  FOR. 

Armed  with  this  information  about  English  we  are  now  able  to  examine  and  break  a  number  of 
historical  ciphers. 


7.2.  Shift  Cipher 

We first present one of the earliest ciphers, called the shift cipher. Encryption is performed by
replacing each letter by the letter located a certain number of places further on in the alphabet.
So for example if the key was three, then the plaintext A would be replaced by the ciphertext D,
the letter B would be replaced by E and so on. The plaintext word HELLO would be encrypted as
the ciphertext KHOOR. When this cipher is used with the key three, it is often called the Caesar
cipher, although in many books the name Caesar cipher is sometimes given to the shift cipher with
any key. Strictly this is not correct since we only have evidence that Julius Caesar used the cipher
with the key three.

Bigram  Freq. (%)    Bigram  Freq. (%)
TH      3.15         HE      2.51
AN      1.72         IN      1.69
ER      1.54         RE      1.48
ES      1.45         ON      1.45
EA      1.31         TI      1.28
AT      1.24         ST      1.21
EN      1.20         ND      1.18

Table 7.2. English bigram frequencies

There  is  a  more  mathematical  explanation  of  the  shift  cipher  which  will  be  instructive  for  future 
discussions.  First  we  need  to  identify  each  letter  of  the  alphabet  with  a  number.  It  is  usual  to 
identify  the  letter  A  with  the  number  0,  the  letter  B  with  number  1,  the  letter  C  with  the  number  2 
and  so  on  until  we  identify  the  letter  Z  with  the  number  25.  After  we  convert  our  plaintext  message 
into  a  sequence  of  numbers,  the  ciphertext  in  the  shift  cipher  is  obtained  by  adding  to  each  number 
the  secret  key  k  modulo  26,  where  the  key  is  a  number  in  the  range  0  to  25.  In  this  way  we  can 
interpret the shift cipher as a stream cipher, with keystream given by the repeating sequence

$$k, k, k, k, k, k, \ldots$$

This  keystream  is  not  very  random,  which  results  in  it  being  easy  to  break  the  shift  cipher.  A  naive 
way  of  breaking  the  shift  cipher  is  to  simply  try  each  of  the  possible  keys  in  turn,  until  the  correct 
one  is  found.  There  are  only  26  possible  keys  so  the  time  for  this  exhaustive  key  search  is  very 
small,  particularly  if  it  is  easy  to  recognize  the  underlying  plaintext  when  it  is  decrypted. 
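The arithmetic description of the shift cipher and the exhaustive key search can be sketched in Python as follows. The function names are ours, and the choice to pass non-letters through unchanged is an assumption made so that the word breaks of the examples are preserved.

```python
import string

ALPHABET = string.ascii_uppercase    # A = 0, B = 1, ..., Z = 25


def shift_encrypt(plaintext: str, k: int) -> str:
    """Add k modulo 26 to every letter; other characters are left as they are."""
    out = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26])
        else:
            out.append(ch)
    return "".join(out)


def shift_decrypt(ciphertext: str, k: int) -> str:
    """Subtract k modulo 26, i.e. encrypt with the key -k."""
    return shift_encrypt(ciphertext, -k)


def brute_force(ciphertext: str) -> list:
    """Return all 26 candidate decryptions as (key, text) pairs."""
    return [(k, shift_decrypt(ciphertext, k)) for k in range(26)]


print(shift_encrypt("HELLO", 3))
for key, candidate in brute_force("KHOOR"):
    print(key, candidate)
```

With only 26 candidates a human can simply scan the output for the one line that reads as English.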

We  shall  show  how  to  break  the  shift  cipher  by  using  the  statistics  of  the  underlying  language. 
Whilst  this  is  not  strictly  necessary  for  breaking  this  cipher,  later  we  shall  see  a  cipher  that  is  made 
up  of  a  number  of  shift  ciphers  applied  in  turn  and  then  the  following  statistical  technique  will  be 
useful.  Using  a  statistical  technique  on  the  shift  cipher  is  also  instructive  as  to  how  statistics  of  the 
underlying  plaintext  can  arise  in  the  resulting  ciphertext.  Take  the  following  example  ciphertext, 
which  since  it  is  public  knowledge  we  represent  in  blue. 

GB  OR,  BE  ABG  GB  OR:  GUNG  VF  GUR  DHRFGVBA: 

JURGURE  ’GVF  ABOYRE  VA  GUR  ZVAQ  GB  FHSSRE 
GUR  FYVATF  NAQ  NEEBJF  BS  BHGENTRBHF  SBEGHAR, 

BE  GB  GNXR  NEZF  NTNVAFG  N  FRN  BS  GEBHOYRF, 

NAQ  OL  BCCBFVAT  RAQ  GURZ?  GB  QVR:  GB  FYRRC; 

AB  ZBER;  NAQ  OL  N  FYRRC  GB  FNL  JR  RAQ 
GUR  URNEG-NPUR  NAQ  GUR  GUBHFNAQ  ANGHENY  FUBPXF 
GUNG  SYRFU  VF  URVE  GB,  ’GVF  N  PBAFHZZNGVBA 
QRIBHGYL  GB  OR  JVFU’Q.  GB  QVR,  GB  FYRRC; 

GB  FYRRC:  CREPUNAPR  GB  QERNZ:  NL,  GURER’F  GUR  EHO; 

SBE  VA  GUNG  FYRRC  BS  QRNGU  JUNG  QERNZF  ZNL  PBZR 
JURA  JR  UNIR  FUHSSYRQ  BSS  GUVF  ZBEGNY  PBVY, 

ZHFG  TVIR  HF  CNHFR:  GURER’F  GUR  ERFCRPG 
GUNG  ZNXRF  PNYNZVGL  BS  FB  YBAT  YVSR; 

One technique used in breaking the previous sample ciphertext is to notice that the ciphertext
still retains details about the word lengths of the underlying plaintext. For example the ciphertext
letter N appears as a single-letter word. Since the only common single-letter words in English
are A and I we can conclude that the key is either 13, since N is thirteen letters on from A in
the alphabet, or 5, since N is five letters on from I in the alphabet. Hence, the moral here is to
always remove word breaks from the underlying plaintext before encrypting using the shift cipher.
But even if we ignore this information about the word break, we can still break this cipher using
frequency analysis.

We compute the frequencies of the letters in the ciphertext and compare them with the
frequencies obtained from English which we saw in Figure 7.1. We present the two bar graphs one
above  each  other  in  Figure  7.2  so  you  can  see  that  one  graph  looks  almost  like  a  shift  of  the  other 
graph.  The  statistics  obtained  from  the  sample  ciphertext  are  given  in  blue,  whilst  the  statistics 
obtained  from  the  underlying  plaintext  language  are  given  in  red.  Note,  we  do  not  compute  the 
red  statistics  from  the  actual  plaintext  since  we  do  not  know  this  yet,  we  only  make  use  of  the 
knowledge  of  the  underlying  language. 


Figure  7.2.  Comparison  of  plaintext  and  ciphertext  frequencies  for  the  shift  cipher  example 


By  comparing  the  two  bar  graphs  in  Figure  7.2  we  can  see  by  how  much  we  think  the  blue 
graph  has  been  shifted  compared  with  the  red  graph.  By  examining  where  we  think  the  plaintext 
letter E may have been shifted, one can hazard a guess that it is shifted by one of

2, 9, 13 or 23.

Then by trying to deduce by how much the plaintext letter A has been shifted we can guess that
it has been shifted by one of

1, 6, 13 or 17.

The  only  shift  value  which  is  consistent  appears  to  be  the  value  13,  and  we  conclude  that  this  is 
the  most  likely  key  value. 

One may ask whether there is a more scientific way of performing the above comparison of bar
graphs. Indeed there is, using something called the statistical distance. Let X and Y be random
variables distributed according to distributions $D_1$ and $D_2$; we let V denote the support of X and
Y (i.e. the set of values which can occur for X or Y with non-zero probability). We then define
the statistical distance (actually the total variation distance, as there are many different statistical
distances one can define) by

$$\Delta[X, Y] = \frac{1}{2} \sum_{u \in V} \left| \Pr_{X \leftarrow D_1}[X = u] - \Pr_{Y \leftarrow D_2}[Y = u] \right|.$$

To apply this to our example we let X denote the probabilities of letters occurring in English, i.e. the
probabilities from Table 7.1, and we let $Y_k$ denote the probabilities obtained from the ciphertext
but shifted by the key value k. So we have twenty-six different distributions $Y_k$ to compare to the
fixed distribution X. The value of k whose distribution has the smallest distance is the most likely key.

Applying  this  method  in  this  example  we  find  the  statistical  distances  given  in  Table  7.3.  The 
value  for  the  key  13  is  significantly  smaller  than  the  values  for  the  other  keys;  thus  we  can  conclude 
(using  this  more  scientific  method)  that  the  key  is  13. 


 k  $\Delta[X, Y_k]$     k  $\Delta[X, Y_k]$
 0  48.4                13  10.8
 1  44.6                14  44.8
 2  44.0                15  57.0
 3  49.5                16  55.3
 4  53.2                17  47.0
 5  52.9                18  48.5
 6  46.0                19  49.1
 7  53.9                20  45.3
 8  52.7                21  56.4
 9  43.8                22  51.6
10  51.3                23  47.5
11  56.8                24  43.8
12  46.7                25  45.2

Table 7.3. Statistical distance between X and $Y_k$ for the shift cipher example
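The key search by statistical distance can be sketched in Python as below. This is an illustration under stated assumptions: the list ENGLISH holds the values of Table 7.1 in units of 0.1% and is normalized before use, distances are computed on probabilities in [0, 1] rather than percentages (so the numbers will not match the scale of Table 7.3), and the function names are ours.

```python
# Table 7.1 frequencies for A..Z, in units of 0.1 percent.
ENGLISH = [82, 15, 28, 42, 127, 22, 20, 61, 70, 1, 8, 40, 24,
           67, 75, 19, 1, 60, 63, 90, 28, 10, 24, 1, 20, 1]


def letter_distribution(text: str) -> list:
    """Relative frequencies of the letters A..Z in text (other characters ignored)."""
    counts = [0] * 26
    for ch in text.upper():
        if "A" <= ch <= "Z":
            counts[ord(ch) - ord("A")] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("text contains no letters")
    return [c / total for c in counts]


def statistical_distance(p: list, q: list) -> float:
    """Total variation distance: half the sum of absolute differences."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def guess_shift_key(ciphertext: str):
    """Return (most likely key, list of the 26 distances)."""
    total = sum(ENGLISH)
    english = [f / total for f in ENGLISH]
    cipher = letter_distribution(ciphertext)
    distances = []
    for k in range(26):
        # Y_k: undo a shift by k, so entry u is the frequency of letter u + k.
        shifted = [cipher[(u + k) % 26] for u in range(26)]
        distances.append(statistical_distance(english, shifted))
    best = min(range(26), key=distances.__getitem__)
    return best, distances


sample = ("GB OR, BE ABG GB OR: GUNG VF GUR DHRFGVBA: "
          "JURGURE 'GVF ABOYRE VA GUR ZVAQ GB FHSSRE "
          "GUR FYVATF NAQ NEEBJF BS BHGENTRBHF SBEGHAR, "
          "BE GB GNXR NEZF NTNVAFG N FRN BS GEBHOYRF")
key, distances = guess_shift_key(sample)
print(key, round(distances[key], 3))
```

The longer the ciphertext, the closer its letter distribution is to that of the language and the more clearly the correct key stands out.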


We  can  now  decrypt  the  ciphertext,  using  this  key.  This  reveals  that  the  underlying  plaintext 
is the following text from Shakespeare's Hamlet:

To  be,  or  not  to  be:  that  is  the  question: 

Whether  ’tis  nobler  in  the  mind  to  suffer 
The  slings  and  arrows  of  outrageous  fortune, 

Or  to  take  arms  against  a  sea  of  troubles, 

And  by  opposing  end  them?  To  die:  to  sleep; 

No  more;  and  by  a  sleep  to  say  we  end 
The  heart-ache  and  the  thousand  natural  shocks 
That  flesh  is  heir  to,  ’tis  a  consummation 
Devoutly  to  be  wish’d.  To  die,  to  sleep; 

To  sleep:  perchance  to  dream:  ay,  there’s  the  rub; 

For  in  that  sleep  of  death  what  dreams  may  come 
When  we  have  shuffled  off  this  mortal  coil, 

Must  give  us  pause:  there’s  the  respect 
That  makes  calamity  of  so  long  life; 

7.3.  Substitution  Cipher 

The main problem with the shift cipher is that the number of keys is too small; we only have 26
possible keys. To increase the number of keys the substitution cipher was invented. To write down
a key for the substitution cipher we first write down the alphabet, and then a permutation of the
alphabet directly below it. This mapping gives the substitution we make between the plaintext
and the ciphertext.

Plaintext alphabet   ABCDEFGHIJKLMNOPQRSTUVWXYZ
Ciphertext alphabet  GOYDSIPELUAVCRJWXZNHBQFTMK

Encryption involves replacing each letter in the top row by its value in the bottom row. Decryption
involves first looking for the letter in the bottom row and then seeing which letter in the top row
maps to it. Hence, the plaintext word HELLO would encrypt to the ciphertext ESVVJ if we used
the substitution given above.
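The table look-up described above can be sketched in Python using the example key from the text. The function names are ours; characters outside the alphabet are assumed to pass through unchanged.

```python
PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "GOYDSIPELUAVCRJWXZNHBQFTMK"   # the example key: a permutation of PLAIN

ENCRYPT_TABLE = str.maketrans(PLAIN, CIPHER)   # top row -> bottom row
DECRYPT_TABLE = str.maketrans(CIPHER, PLAIN)   # bottom row -> top row


def substitution_encrypt(message: str) -> str:
    """Replace each letter by the letter below it in the key table."""
    return message.upper().translate(ENCRYPT_TABLE)


def substitution_decrypt(ciphertext: str) -> str:
    """Replace each letter by the letter above it in the key table."""
    return ciphertext.upper().translate(DECRYPT_TABLE)


c = substitution_encrypt("HELLO")
print(c, substitution_decrypt(c))
```

Because the key is a permutation, the decryption table is just the inverse mapping and no letter is ever ambiguous.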

The number of possible keys is equal to the total number of permutations on 26 letters, namely
the size of the group $S_{26}$, which is

$26! \approx 4.03 \cdot 10^{26} \approx 2^{88}$.

Since, as a rule of thumb, it is only feasible to run a computer on a problem which takes under $2^{80}$
steps, we can deduce that this key space is far too large to enable a brute force search, even
using a modern computer. Despite this, we can break substitution ciphers using statistics of the
underlying plaintext language, just as we did for the shift cipher.
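The size of the key space is easy to check numerically. The short sketch below uses only the Python standard library; the variable names are ours.

```python
import math

keys = math.factorial(26)   # number of permutations of 26 letters
bits = math.log2(keys)      # size of the key space measured in bits

print(f"26! is about {keys:.3e}, i.e. roughly 2^{bits:.1f}")
print("within the 2^80 rule of thumb:", bits < 80)
```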

Whilst  the  shift  cipher  can  be  considered  as  a  stream  cipher  since  the  ciphertext  is  obtained 
from  the  plaintext  by  combining  it  with  a  keystream,  the  substitution  cipher  operates  much  more 
like  a  modern  block  cipher,  with  a  block  length  of  one  English  letter.  A  ciphertext  block  is  obtained 
from  a  plaintext  block  by  applying  some  (admittedly  simple  in  this  case)  key-dependent  algorithm. 

Substitution  ciphers  are  the  ciphers  commonly  encountered  in  puzzle  books;  they  have  an 
interesting  history  and  have  occurred  many  times  in  literature.  See  for  example  the  Sherlock 
Holmes  story  The  Adventure  of  the  Dancing  Men  by  Arthur  Conan  Doyle;  the  plot  of  this  story 
rests  on  a  substitution  cipher  where  the  ciphertext  characters  are  taken  from  an  alphabet  of  “stick 
men”  in  various  positions.  The  method  of  breaking  the  cipher  as  described  by  Holmes  to  Watson 
in  this  story  is  precisely  the  method  we  shall  adopt  below. 

Example:  We  give  a  detailed  example,  which  we  make  slightly  easier  by  keeping  in  the  ciphertext 
details  about  the  underlying  word  spacing  used  in  the  plaintext.  This  is  only  for  ease  of  exposition; 
the  techniques  we  describe  can  still  be  used  if  we  ignore  these  word  spacings,  although  more  care 
and  thought  is  required.  Consider  the  ciphertext 

XSO MJIWXVL JODIVA STW VAO VY OZJVCO’W LTJDOWX KVAKOAXJTXIVAW VY
SIDS XOKSAVLVDQ IAGZWXJQ. KVUCZXOJW, KVUUZAIKTXIVAW TAG
UIKJVOLOKXJVAIKW TJO HOLL JOCJOWOAXOG, TLVADWIGO GIDIXTL UOGIT,
KVUCZXOJ DTUOW TAG OLOKXJVAIK KVUUOJKO. TW HOLL TW SVWXIAD UTAQ
JOWOTJKS TAG CJVGZKX GONOLVCUOAX KOAXJOW VY UTPVJ DLVMTL KVUCTAIOW,
XSO JODIVA STW T JTCIGLQ DJVHIAD AZUMOJ VY IAAVNTXINO AOH KVUCTAIOW.
XSO KVUCZXOJ WKIOAKO GOCTJXUOAX STW KLVWO JOLTXIVAWSICW HIXS UTAQ
VY XSOWO VJDTAIWTXIVAW NIT KVLLTMVJTXINO CJVPOKXW, WXTYY
WOKVAGUOAXW TAG NIWIXIAD IAGZWXJITL WXTYY. IX STW JOKOAXLQ
IAXJVGZKOG WONOJTL UOKSTAIWUW YVJ GONOLVCIAD TAG WZCCVJXIAD
OAXJOCJOAOZJITL WXZGOAXW TAG WXTYY, TAG TIUW XV CLTQ T WIDAIYIKTAX
JVLO IA XSO GONOLVCUOAX VY SIDS-XOKSAVLVDQ IAGZWXJQ IA XSO JODIVA.

XSO GOCTJXUOAX STW T LTJDO CJVDJTUUO VY JOWOTJKS WZCCVJXOG MQ
IAGZWXJQ, XSO OZJVCOTA ZAIVA, TAG ZE DVNOJAUOAX JOWOTJKS
OWXTMLIWSUOAXW TAG CZMLIK KVJCVJTXIVAW. T EOQ OLOUOAX VY XSIW IW XSO
WXJVAD LIAEW XSTX XSO GOCTJXUOAX STW HIXS XSO KVUCZXOJ, KVUUZAIKTXIVAW,
UIKJVOLOKXJVAIKW TAG UOGIT IAGZWXJIOW IA XSO MJIWXVL JODIVA. XSO
TKTGOUIK JOWOTJKS CJVDJTUUO IW VJDTAIWOG IAXV WONOA DJVZCW, LTADZTDOW
TAG TJKSIXOKXZJO, GIDIXTL UOGIT, UVMILO TAG HOTJTMLO KVUCZXIAD,
UTKSIAO LOTJAIAD, RZTAXZU KVUCZXIAD, WQWXOU NOJIYIKTXIVA, TAG
KJQCXVDJTCSQ TAG IAYVJUTXIVA WOKZJIXQ.

We  can  compute  the  following  frequencies  for  single  letters  in  the  above  ciphertext. 


Letter  Freq. (%)    Letter  Freq. (%)    Letter  Freq. (%)
A       8.6995       B       0.0000       C       3.0493
D       3.1390       E       0.2690       F       0.0000
G       3.6771       H       0.6278       I       7.8923
J       7.0852       K       4.6636       L       3.5874
M       0.8968       N       1.0762       O       11.479
P       0.1793       Q       1.3452       R       0.0896
S       3.5874       T       8.0717       U       4.1255
V       7.2645       W       6.6367       X       8.0717
Y       1.6143       Z       2.7802

In  addition  we  determine  that  the  most  common  bigrams  in  this  piece  of  ciphertext  are 

TA,  AX,  IA,  VA,  WX,  XS,  AG,  OA,  JO,  JV, 
whilst  the  most  common  trigrams  are 

OAX,  TAG,  IVA,  XSO,  KVU,  TXI,  UOA,  AXS. 
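Statistics of this kind are tedious to gather by hand but simple to compute. The following Python sketch counts single letters and n-grams; it assumes that spaces and punctuation are stripped before counting, so that n-grams may run across word boundaries (the trigram AXS in the list above suggests this is how the counts were made, but that is our reading). The function names are our own.

```python
from collections import Counter

def only_letters(text):
    """Upper-case the text and drop everything that is not a letter."""
    return "".join(c for c in text.upper() if c.isalpha())

def letter_frequencies(text):
    """Percentage frequency of each letter that occurs in the text."""
    letters = only_letters(text)
    return {c: 100 * n / len(letters) for c, n in Counter(letters).items()}

def common_ngrams(text, n, top=10):
    """The most common n-letter sequences, counted across word boundaries."""
    letters = only_letters(text)
    counts = Counter(letters[i:i + n] for i in range(len(letters) - n + 1))
    return counts.most_common(top)

sample = "XSO MJIWXVL JODIVA STW VAO VY OZJVCO'W LTJDOWX KVAKOAXJTXIVAW"
freqs = letter_frequencies(sample)
print(sorted(freqs, key=freqs.get, reverse=True)[:5])
print(common_ngrams(sample, 2, top=5))
```

Running the two functions on the whole ciphertext, rather than on the short sample used here, gives the kind of tables shown above.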

Since  the  ciphertext  letter  O  occurs  with  the  greatest  frequency,  namely  11.479,  we  can  guess  that 
the  ciphertext  letter  O  corresponds  to  the  plaintext  letter  E.  We  now  look  at  what  this  means  for 
two  of  the  common  trigrams  found  in  the  ciphertext 

•  The  ciphertext  trigram  OAX  corresponds  to  E  *  *. 

•  The  ciphertext  trigram  XSO  corresponds  to  *  *  E. 

We examine similar common trigrams in English, which start or end with the letter E. We find
that three common ones are given by ENT, ETH and THE. Since in the ciphertext trigrams we
have one letter, X, in the last position in one and the first position in the other, we look for a
similar letter in the English trigrams; the letter T plays exactly this role in ENT and THE. We
can conclude that it is highly likely that we have the
correspondence

•  X  =  T, 

•  S  =  H, 

•  A  =  N. 

Even  after  this  small  piece  of  analysis  we  find  that  it  is  much  easier  to  understand  what  the 
underlying  plaintext  should  be.  If  we  focus  on  the  first  two  sentences  of  the  ciphertext  we  are 
trying  to  break,  and  we  change  the  letters  for  which  we  think  we  have  found  the  correct  mappings, 
then  we  obtain: 

THE MJIWTVL JEDIVN HTW VNE VY EZJVCE’W LTJDEWT KVNKENTJTTIVNW
VY  HIDH  TEKHNVLVDQ  INGZWTJQ.  KVUCZTEJW,  KVUUZNIKTTIVNW  TNG 
UIKJVELEKTJVNIKW  TJE  HELL  JECJEWENTEG,  TLVNDWIGE  GIDITTL  UEGIT, 
KVUCZTEJ  DTUEW  TNG  ELEKTJVNIK  KVUUEJKE. 

Recall,  this  was  after  the  four  substitutions 

O  =  E,  X  =  T,  S  =  H,  A  =  N. 

We now cheat and use the fact that we have retained the word sizes in the ciphertext. We see that
since the ciphertext letter T occurs as a single-letter word we must have

T = I or T = A.

The ciphertext letter T occurs with a frequency of 8.0717%, which is the highest frequency left,
hence we are far more likely to have

T  =  A. 

We  have  already  considered  the  most  popular  trigram  in  the  ciphertext  so  turning  our  attention 
to  the  next  most  popular  trigram  we  see  that  it  is  equal  to  TAG  which  we  suspect  corresponds  to 
the  plaintext  AN*.  Therefore  it  is  highly  likely  that  G  =  D,  since  AND  is  a  popular  trigram  in 
English.  Our  partially  decrypted  ciphertext  is  now  equal  to 

THE MJIWTVL JEDIVN HAW VNE VY EZJVCE’W LAJDEWT KVNKENTJATIVNW
VY  HIDH  TEKHNVLVDQ  INDZWTJQ.  KVUCZTEJW,  KVUUZNIKATIVNW  AND 
UIKJVELEKTJVNIKW  AJE  HELL  JECJEWENTED,  ALVNDWIDE  DIDITAL  UEDIA, 
KVUCZTEJ  DAUEW  AND  ELEKTJVNIK  KVUUEJKE. 

This  was  after  the  six  substitutions 

O  =  E,  X  =  T,  S  =  H, 

A  =  N,  T  =  A,  G  =  D. 

We now look at two-letter words which occur in the ciphertext:

•  IX 

This  corresponds  to  the  plaintext  *T.  Therefore  the  ciphertext  letter  I  must  be  one  of  the 
plaintext  letters  A  or  I,  since  the  only  common  two-letter  words  in  English  ending  in  T 
are  AT  and  IT.  We  already  have  worked  out  what  the  plaintext  character  A  corresponds 
to, hence we must have I = I.

•  XV 

This  corresponds  to  the  plaintext  T*.  Hence,  we  must  have  V  =  O. 

•  VY 

This corresponds to the plaintext O*. Hence, the ciphertext letter Y probably corresponds
to one of F, N or R. We already know the ciphertext letter corresponding to N. In the
ciphertext the frequency of Y is about 1.6%, but in English we expect F to occur with
frequency 2.2% and R to occur with frequency 6.0%. Hence, it is more likely that Y = F.

•  IW 

This corresponds to the plaintext I*. Therefore, the ciphertext letter W must correspond to
one of the plaintext letters F, N, S and T. We have already assigned F, N and T, hence W = S.

All these deductions leave the partially decrypted ciphertext as

THE  MJISTOL  JEDION  HAS  ONE  OF  EZJOCE’S  LAJDEST  KONKENTJATIONS  OF 
HIDH  TEKHNOLODQ  INDZSTJQ.  KOUCZTEJS,  KOUUZNIKATIONS  AND 
UIKJOELEKTJONIKS  AJE  HELL  JECJESENTED,  ALONDSIDE  DIDITAL  UEDIA, 
KOUCZTEJ  DAUES  AND  ELEKTJONIK  KOUUEJKE. 

This  was  after  the  ten  substitutions 

O  =  E,  X  =  T,  S  =  H,  A  =  N,  T  =  A, 

G  =  D,  I  =  I,  V  =  O,  Y  =  F,  W  =  S. 

Even with only ten of the ciphertext letters determined it is now quite easy to understand the underlying
plaintext,  taken  from  the  website  of  the  University  of  Bristol  Computer  Science  Department  circa 
2001.  We  leave  it  to  the  reader  to  determine  the  final  substitutions  and  recover  the  plaintext 
completely. 
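The bookkeeping in this example, where guessed letters are substituted and the remaining ciphertext letters are left in place, can be sketched as follows. The function name is our own. Note that the replacement has to be done in a single pass over the text: replacing one letter at a time would wrongly chain guesses such as T = A and A = N.

```python
def apply_guesses(ciphertext, guesses):
    """Replace ciphertext letters by guessed plaintext letters; leave the rest alone."""
    return "".join(guesses.get(c, c) for c in ciphertext)

# The ten substitutions found so far (ciphertext letter -> plaintext letter).
ten = {"O": "E", "X": "T", "S": "H", "A": "N", "T": "A",
       "G": "D", "I": "I", "V": "O", "Y": "F", "W": "S"}

print(apply_guesses("XSO MJIWXVL JODIVA STW VAO VY", ten))
# THE MJISTOL JEDION HAS ONE OF
```

A variant which writes the guessed letters in lower case makes it easier to see which letters are still ciphertext.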


7.4.  Vigenere  Cipher 

The  problem  with  the  shift  cipher  and  the  substitution  cipher  was  that  each  plaintext  letter  always 
encrypted  to  the  same  ciphertext  letter.  Hence  underlying  statistics  of  the  language  could  be  used 
to break the cipher. For example it was easy to determine which ciphertext letter corresponded
to the plaintext letter E. From the early 1800s onwards, cipher designers tried to break this link
between  the  plaintext  and  ciphertext. 

The  substitution  cipher  we  used  above  was  a  mono-alphabetic  substitution  cipher,  in  that  only 
one  alphabet  substitution  was  used  to  encrypt  the  whole  alphabet.  One  way  to  solve  our  problem  is 
to  take  a  number  of  substitution  alphabets  and  then  encrypt  each  letter  with  a  different  alphabet. 
Such  a  system  is  called  a  polyalphabetic  substitution  cipher. 

For  example  we  could  take 

Plaintext alphabet       ABCDEFGHIJKLMNOPQRSTUVWXYZ

Ciphertext alphabet one  TMKGOYDSIPELUAVCRJWXZNHBQF
Ciphertext alphabet two  DCBAHGFEMLKJIZYXWVUTSRQPON

Then we encrypt the plaintext letters in odd-numbered positions using the first ciphertext
alphabet,  whilst  we  encrypt  the  plaintext  letters  in  even-numbered  positions  using  the  second 
alphabet.  For  example,  the  plaintext  word  HELLO  would  encrypt  to  SHLJV,  using  the  above  two 
alphabets.  Notice  that  the  two  occurrences  of  L  in  the  plaintext  encrypt  to  two  different  ciphertext 
characters.  Thus  we  have  made  it  harder  to  use  the  underlying  statistics  of  the  language.  If  one  now 
does  a  naive  frequency  analysis  one  no  longer  obtains  a  common  ciphertext  letter  corresponding  to 
the  plaintext  letter  E. 
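The two-alphabet scheme just described can be sketched as follows; the code cycles through however many alphabets are supplied, so the same function would serve for five alphabets. The names are our own, and the input is assumed to consist of letters only.

```python
import string

PLAIN = string.ascii_uppercase
ALPHABETS = [
    "TMKGOYDSIPELUAVCRJWXZNHBQF",  # ciphertext alphabet one
    "DCBAHGFEMLKJIZYXWVUTSRQPON",  # ciphertext alphabet two
]

def poly_encrypt(plaintext, alphabets=ALPHABETS):
    """Encrypt letter i (counting from zero) with alphabet number i mod len(alphabets)."""
    out = []
    for i, ch in enumerate(plaintext.upper()):
        row = alphabets[i % len(alphabets)]
        out.append(row[PLAIN.index(ch)])
    return "".join(out)

print(poly_encrypt("HELLO"))  # SHLJV
```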

Essentially  we  are  encrypting  the  message  two  letters  at  a  time,  hence  we  have  a  block  cipher 
with  block  length  two  English  characters.  In  real  life  one  may  wish  to  use  around  five  rather  than 
just  two  alphabets  and  the  resulting  key  becomes  very  large  indeed.  With  five  alphabets  the  total 
key  space  is 

$(26!)^5 \approx 2^{441}$,

but the user only needs to remember the key which is a sequence of

$26 \cdot 5 = 130$


letters.  However,  just  to  make  life  hard  for  the  attacker,  the  number  of  alphabets  in  use  should 
also  be  hidden  from  his  view  and  form  part  of  the  key.  But  for  the  average  user  in  the  early  1800s 
this  was  far  too  unwieldy  a  system,  since  the  key  was  too  hard  to  remember. 

Despite  its  shortcomings  the  most  famous  cipher  during  the  nineteenth  century  was  based  on 
precisely this principle. The Vigenere cipher, invented in 1553 by Giovan Battista Bellaso, was a
variant on the above theme, but the key was easy to remember. When looked at in one way the
Vigenere cipher is a polyalphabetic block cipher, but when looked at in another, it is a stream
cipher which is a natural generalization of the shift cipher.

The description of the Vigenere cipher as a block cipher takes the description of the polyalphabetic
cipher above but restricts the possible ciphertext alphabets to one of the 26 possible cyclic
shifts of the standard alphabet. Suppose five alphabets were used; this reduces the key space down
to

$26^5 \approx 2^{23}$

and the size of the key to be remembered to a sequence of five numbers between 0 and 25.

However,  the  description  of  the  Vigenere  cipher  as  a  stream  cipher  is  much  more  natural.  Just 
like the shift cipher, the Vigenere cipher again identifies letters with the numbers 0, ..., 25. The
secret  key  is  a  short  sequence  of  letters  (e.g.  a  word)  which  is  repeated  again  and  again  to  form 
a  keystream.  Encryption  involves  adding  the  plaintext  letter  to  a  key  letter.  Thus  if  the  key  is 
SESAME,  encryption  works  as  follows, 

THISISATESTMESSAGE 
SESAMESESAMESESAME 
LLASUWSXWSFQWWKASI

Again  we  notice  that  A  will  encrypt  to  a  different  letter  depending  on  where  it  appears  in  the 
message. 
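Viewed as a stream cipher, Vigenere encryption is a one-line addition modulo 26. The sketch below reproduces the SESAME example; the function name is our own and the input is assumed to contain letters only.

```python
from itertools import cycle

def vigenere_encrypt(plaintext, keyword):
    """Add each key letter to the matching plaintext letter modulo 26."""
    out = []
    for p, k in zip(plaintext.upper(), cycle(keyword.upper())):
        out.append(chr((ord(p) - 65 + ord(k) - 65) % 26 + 65))
    return "".join(out)

print(vigenere_encrypt("THISISATESTMESSAGE", "SESAME"))  # LLASUWSXWSFQWWKASI
```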
But the Vigenere cipher is still easy to break using the underlying statistics of English. Once
we  have  found  the  length  of  the  keyword,  breaking  the  ciphertext  is  the  same  as  breaking  the  shift 
cipher  a  number  of  times. 

Example:  As  an  example,  suppose  the  ciphertext  is  given  by 

UTPDHUG  NYH  USVKCG  MVCE  FXL  KQIB.  WX  RKU  GI  TZN,  RLS  BBHZLXMSNP 
KDKS; CEB IH HKEW IBA, YYM SBR PFR SBS, JV UPL O UVADGR HRRWXF. JV ZTVOOV
YH  ZCQU  Y  UKWGEB,  PL  UQFB  P  FOUKCG,  TBF  RQ  VHCF  R  KPG,  OU  KFT  ZCQU  MAW 
QKKW  ZGSY,  FP  PGM  QKFTK  UQFB  DER  EZRN,  MCYE,  MG  UCTFSVA,  WP  KFT  ZCQU 
MAW  KQIJS.  LCOV  NTHDNV  JPNUJVB  IH  GGV  RWX  ONKCGTHKFL  XG  VKD,  ZJM  VG 
CCI  MVGD  JPNUJ,  RLS  EWVKJT  ASGUCS  MVGD;  DDK  VG  NYH  PWUV  CCHIIY  RD  DBQN 
RWTH  PFRWBBI  VTTK  VCGNTGSF  FL  IAWU  XJDUS,  HFP  VHCF,  RR  LAWEY  QDFS 
RVMEES  FZB  CHH  JRTT  MVGZP  UBZN  FD  ATIIYRTK  WP  KFT  HIVJCI;  TBF  BLDPWPX 
RWTH  ULAW  TG  VYCHX  KQLJS  US  DCGCW  OPPUPR,  VG  KFDNUJK  GI  JIKKC  PL  KGCJ 
IAOV  KFTR  GJFSAW  KTZLZES  WG  RWXWT  VWTL  WP  XPXGG,  CJ  FPOS  VYC  BTZCUW 
XG  ZGJQ  PMHTRAIBJG  WMGFG.  JZQ  DPB  JVYGM  ZCLEWXR:  CEB  IAOV  NYH  JIKKC 
TGCWXF  UHF  JZK. 

WX  VCU  LD  YITKFTK  WPKCGVCWIQT  PWVY  QEBFKKQ,  QNH  NZTTW  IRFL  IAS 
VFRPE ODJRXGSPTC EKWPTGEES, GMCG TTVVPLTFFJ; YCW WV NYH TZYRWH LOKU
MU AWO, KFPM VG BLTP VQN RD DSGG
AWKWUKKPL  KGCJ,  XY  OPP  KPG  ONZTT  ICUJCHLSF  KFT  DBQNJTWUG.  DYN  MVCK 
ZT  MFWCW  HTWF  FD  JL,  OPU  YAE  CH  LQ!  PGR  UF,  YH  MWPP  RXF  CDJCGOSF,  XMS 
UZGJQ  JL,  SXVPN  HBG! 

There  is  a  way  of  finding  the  length  of  the  keyword,  which  is  repeated  to  form  the  keystream, 
called  the  Kasiski  test.  First  we  need  to  look  for  repeated  sequences  of  characters.  Recall  that 
English  has  a  large  repetition  of  certain  bigrams  or  trigrams  and  over  a  long  enough  string  of  text 
these  are  likely  to  match  up  to  the  same  two  or  three  letters  in  the  key  every  so  often.  By  examining 
the  distance  between  two  repeated  sequences  we  can  guess  the  length  of  the  keyword.  Each  of  these 
distances should be a multiple of the keyword length, hence taking the greatest common divisor of all
distances  between  the  repeated  sequences  should  give  a  good  guess  as  to  the  keyword  length. 

Let  us  examine  the  above  ciphertext  and  look  for  the  bigram  WX.  The  gaps  between  some  of 
the  occurrences  of  this  bigram  are  9,  21,  66  and  30,  some  of  which  may  have  occurred  by  chance, 
whilst  some  may  reveal  information  about  the  length  of  the  keyword.  We  now  take  the  relevant 
greatest  common  divisors  to  find, 

gcd(30, 66) = 6, and gcd(9, 21) = gcd(9, 66) = gcd(9, 30) = gcd(21, 66) = 3.

We  are  unlikely  to  have  a  keyword  of  length  three  so  we  conclude  that  the  gaps  of  9  and  21  occurred 
purely  by  chance.  Hence,  our  best  guess  for  the  keyword  is  that  it  is  of  length  six. 
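The Kasiski test is straightforward to mechanise. The sketch below finds the gaps between successive occurrences of a chosen n-gram and then takes greatest common divisors. The function name is ours, the short test string is made up for illustration, and whether spaces and punctuation should be removed before positions are counted is left to the caller.

```python
from functools import reduce
from math import gcd

def repeat_gaps(text, ngram):
    """Distances between consecutive occurrences of ngram in text."""
    positions = []
    start = text.find(ngram)
    while start != -1:
        positions.append(start)
        start = text.find(ngram, start + 1)
    return [b - a for a, b in zip(positions, positions[1:])]

print(repeat_gaps("WXAAAAAAAWXBBBBWX", "WX"))  # [9, 6]

# The gaps quoted in the text: 30 and 66 point to a keyword of length six,
# whereas including 9 or 21 would pull the common divisor down to three.
print(reduce(gcd, [30, 66]))         # 6
print(reduce(gcd, [9, 21, 66, 30]))  # 3
```

In practice one collects the gaps for many repeated bigrams and trigrams and looks for the divisor which explains most of them, since a few repeats will always occur by chance.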

Now  we  take  every  sixth  letter  and  look  at  the  statistics  just  as  we  did  for  a  shift  cipher  to 
deduce  the  first  letter  of  the  keyword.  We  can  now  see  the  advantage  of  using  the  histograms  or 
statistical  distance  to  break  the  shift  cipher  earlier.  If  we  used  the  naive  method  and  tried  each 
of  the  26  keys  in  turn  we  could  still  not  detect  which  key  is  correct,  since  every  sixth  letter  of  an 
English  sentence  does  not  produce  an  English  sentence.  Thus  we  need  to  resort  to  using  histograms 
or  the  statistical  distance  used  earlier. 
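A possible way of automating this step is sketched below. It assumes that the statistical distance between two distributions is half the sum of the absolute differences of their probabilities, and it uses a table of approximate English letter frequencies; both the table and the function names are our own assumptions, and other frequency tables differ slightly.

```python
import string

ALPHABET = string.ascii_uppercase
# Approximate English letter frequencies in percent, A to Z (an assumed table).
ENGLISH = [8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.0, 0.2, 0.8, 4.0, 2.4,
           6.7, 7.5, 1.9, 0.1, 6.0, 6.3, 9.1, 2.8, 1.0, 2.4, 0.2, 2.0, 0.1]

def distribution(letters):
    """Relative frequency of each letter A..Z in a string or list of letters."""
    return [letters.count(c) / len(letters) for c in ALPHABET]

def statistical_distance(p, q):
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def best_shift(letters):
    """The shift k which, when undone, makes the letters look most like English."""
    total = sum(ENGLISH)
    english = [f / total for f in ENGLISH]
    observed = distribution(letters)
    # Undoing a shift of k sends ciphertext letter i+k back to plaintext letter i.
    scores = [statistical_distance(english, observed[k:] + observed[:k])
              for k in range(26)]
    return min(range(26), key=scores.__getitem__)

def keyword_letter(ciphertext, period, offset):
    """Guess one keyword letter from every period-th letter, starting at offset."""
    letters = [c for c in ciphertext.upper() if c in ALPHABET][offset::period]
    return ALPHABET[best_shift(letters)]
```

For the example in the text one would call `keyword_letter(ciphertext, 6, i)` for i = 0, ..., 5 to guess the six keyword letters in turn. With only a few dozen letters in each slice the guess can occasionally be wrong, so it is worth inspecting the runner-up shifts as well.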

The  relevant  bar  charts  for  every  sixth  letter  starting  with  the  first  are  given  in  Figure  7.3.  We 
look  for  the  possible  locations  of  the  three  peaks  corresponding  to  the  plaintext  letters  A,  E  and 
T.  We  see  that  this  sequence  seems  to  be  shifted  by  two  positions  in  the  blue  graph  compared  with 
the  red  graph.  Hence  we  can  suspect  that  the  first  letter  of  the  keyword  is  C,  since  C  corresponds 
to  a  shift  of  two.  Computing  the  statistical  distance  between  the  frequency  of  letters  in  English, 
and those of every sixth letter in the ciphertext shifted by a key k, produces the results in Table
7.4, which indeed confirms our guess that the first letter of the keyword is C.

Figure 7.3. Comparison of plaintext and ciphertext frequencies for every sixth
letter of the Vigenere example, starting with the first letter

Table 7.4. Statistical distance between the English letter frequencies and those of every sixth
letter in the Vigenere example shifted by a key k, starting with the first letter


We perform a similar step for every sixth letter, starting with the second one. The resulting
bar graphs are given in Figure 7.4. Using the same technique we find that the blue graph appears
to have been shifted along by 17 spaces, which corresponds to the second letter of the keyword
being equal to R. Computing the statistical distance in Table 7.5 again confirms this guess.

Figure 7.4. Comparison of plaintext and ciphertext frequencies for every sixth
letter of the Vigenere example, starting with the second letter

Table 7.5. Statistical distance between the English letter frequencies and those of every sixth
letter in the Vigenere example shifted by a key k, starting with the second letter

Continuing in a similar way for the remaining four letters of the keyword we find the keyword
is

CRYPTO.

The underlying plaintext is then found to be the following text from A Christmas Carol by Charles
Dickens.

Scrooge was better than his word. He did it all, and infinitely more; and to Tiny Tim, who did
not die, he was a second father. He became as good a friend, as good a master, and as good a man,
as the good old city knew, or any other good old city, town, or borough, in the good old world.
Some  people  laughed  to  see  the  alteration  in  him,  but  he  let  them  laugh,  and  little  heeded  them; 
for  he  was  wise  enough  to  know  that  nothing  ever  happened  on  this  globe,  for  good,  at  which  some 
people  did  not  have  their  fill  of  laughter  in  the  outset;  and  knowing  that  such  as  these  would  be 
blind  anyway,  he  thought  it  quite  as  well  that  they  should  wrinkle  up  their  eyes  in  grins,  as  have 
the  malady  in  less  attractive  forms.  His  own  heart  laughed:  and  that  was  quite  enough  for  him. 

He  had  no  further  intercourse  with  Spirits,  but  lived  upon  the  Total  Abstinence  Principle,  ever 
afterwards;  and  it  was  always  said  of  him,  that  he  knew  how  to  keep  Christmas  well,  if  any  man 
alive  possessed  the  knowledge.  May  that  be  truly  said  of  us,  and  all  of  us!  And  so,  as  Tiny  Tim 
observed,  God  bless  Us,  Every  One! 
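Once the keyword is known, decryption simply subtracts the keystream. In the sketch below (function name ours) spaces and punctuation are copied through unchanged and the keyword only advances on letters, which is how the example ciphertext appears to have been produced.

```python
def vigenere_decrypt(ciphertext, keyword):
    """Subtract the repeating keyword from the letters; copy other characters."""
    out, i = [], 0
    for c in ciphertext:
        if c.isalpha():
            k = ord(keyword[i % len(keyword)].upper()) - 65
            out.append(chr((ord(c.upper()) - 65 - k) % 26 + 65))
            i += 1  # the keystream only advances on letters
        else:
            out.append(c)
    return "".join(out)

print(vigenere_decrypt("UTPDHUG NYH USVKCG MVCE FXL KQIB.", "CRYPTO"))
# SCROOGE WAS BETTER THAN HIS WORD.
```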
7.5. A Permutation Cipher


The ideas behind substitution-type ciphers form part of the design of modern symmetric systems.
For  example,  later  we  shall  see  that  both  DES  and  AES  make  use  of  a  component  called  an  S-Box, 
which  is  simply  a  substitution.  The  other  component  that  is  used  in  modern  symmetric  ciphers  is 
based  on  permutations. 

Permutation  ciphers  have  been  around  for  a  number  of  centuries.  Here  we  shall  describe  the 
simplest, which is particularly easy to break. We first fix a permutation group $S_n$ for a small value
of $n$, and a permutation $\sigma \in S_n$. It is the value of $\sigma$ which will be the secret key. As an example
suppose we take

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 1 & 3 & 5 \end{pmatrix} = (1243) \in S_5.$$


Now  take  some  plaintext,  say 


Once  upon  a  time  there  was  a  little  girl  called  Snow  White. 


We  break  the  text  into  chunks  of  five  letters,  and  remove  capitalisations, 


onceu  ponat  imeth  erewa  salit  tlegi  rlcal  ledsn  owwhi  te. 

We then pad the message with some random letters, so that we have a multiple of five letters in
total


onceu  ponat  imeth  erewa  salit  tlegi  rlcal  ledsn  owwhi  teahb. 

Then we take each five-letter chunk in turn and swap the letters around according to our secret
permutation $\sigma$, moving the letter in position $i$ of each chunk to position $\sigma(i)$. With our
example permutation, we obtain

coenu npaot eitmh eewra lsiat etgli crall dlsen wohwi atheb.

We then remove the spaces, so as to hide the value of $n$, producing the ciphertext

coenunpaoteitmheewralsiatetglicralldlsenwohwiatheb.
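The chunk-and-permute procedure can be sketched as follows. We represent the key as the list of images of 1, ..., n, and we take the convention that the letter in position i of a chunk moves to position sigma(i), which is what the first chunk of the example (onceu becoming coenu) shows. The function name and the use of Python's `random` module for padding are our own choices.

```python
import random
import string

SIGMA = [2, 4, 1, 3, 5]  # images of 1..5 under the example permutation

def permute_encrypt(plaintext, sigma=SIGMA, rng=random):
    n = len(sigma)
    letters = [c for c in plaintext.lower() if c.isalpha()]
    while len(letters) % n:  # pad with random letters to a multiple of n
        letters.append(rng.choice(string.ascii_lowercase))
    out = []
    for start in range(0, len(letters), n):
        block = [""] * n
        for i, ch in enumerate(letters[start:start + n]):
            block[sigma[i] - 1] = ch  # position i+1 moves to position sigma(i+1)
        out.extend(block)
    return "".join(out)

print(permute_encrypt("onceu"))  # coenu
```

Decryption applies the inverse permutation to each chunk in the same way.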

However, breaking a permutation cipher is easy with a chosen plaintext attack, assuming the group
of permutations used (i.e. the value of n) is reasonably small. In such an attack the attacker
selects a plaintext of their choosing and asks for its
encryption. In this specific example, they ask one of the parties to encrypt the message

abcdefghijklmnopqrstuvwxyz, 


to  obtain  the  ciphertext 


cadbehfigjmknlorpsqtwuxvyz.


We  can  then  deduce  that  the  permutation  looks  something  like 


$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & \cdots \\ 2 & 4 & 1 & 3 & 5 & 7 & 9 & 6 & 8 & 10 & 12 & 14 & 11 & 13 & 15 & \cdots \end{pmatrix}$$

We see that the sequence repeats (modulo 5) after every five steps and so the value of $n$ is probably
equal to five. We can recover the key by simply taking the first five columns of the above
permutation.
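The attack can be sketched in code as follows. To keep the illustration free of padding we use only the first 25 letters of the alphabet as the chosen plaintext; the function name and this simplification are our own.

```python
def recover_permutation(plain, cipher):
    """Recover the key from a chosen plaintext in which no letter is repeated."""
    where = {ch: j for j, ch in enumerate(cipher, start=1)}
    image = [where[ch] for ch in plain]  # plaintext position i+1 lands at image[i]
    for n in range(1, len(plain) + 1):
        if len(plain) % n:
            continue
        first = image[:n]
        # The first n columns must permute 1..n, and the pattern must then repeat.
        if sorted(first) == list(range(1, n + 1)) and all(
            image[i] == first[i % n] + (i // n) * n for i in range(len(plain))
        ):
            return first
    return None

print(recover_permutation("abcdefghijklmnopqrstuvwxy",
                          "cadbehfigjmknlorpsqtwuxvy"))
# [2, 4, 1, 3, 5]
```

The function returns the smallest block length consistent with the data, which is the natural guess for the value of n.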


Chapter  Summary 


• Many early ciphers can be broken because they do not successfully hide the underlying
statistics of the language.

• Important principles behind early ciphers are those of substitution and permutation.

•  Ciphers  can  either  work  on  blocks  of  characters  via  some  keyed  algorithm  or  simply  consist 
of  adding  some  keystream  to  each  plaintext  character. 


Further  Reading 

The best book on the history of ciphers is that by Kahn. It is a weighty tome, so those wishing
a more rapid introduction should consult the book by Singh. The book by Churchhouse also
gives an overview of a number of historical ciphers.

R.  Churchhouse.  Codes  and  Ciphers.  Julius  Caesar,  the  Enigma  and  the  Internet.  Cambridge 
University  Press,  2001. 

D.  Kahn.  The  Codebreakers:  The  Comprehensive  History  of  Secret  Communication  from  Ancient 
Times  to  the  Internet.  Scribner,  1996. 

S. Singh. The Codebook: The Evolution of Secrecy from Mary, Queen of Scots to Quantum
Cryptography. Doubleday, 2000.


CHAPTER  8 


The  Enigma  Machine 


Chapter  Goals 

•  To  explain  the  working  of  the  Enigma  machine. 

•  To  explain  how  the  German  military  used  the  Enigma  machine  during  World  War  II,  in 
particular  how  session  keys  were  transmitted  from  the  sender  to  the  receiver. 

•  To  explain  how  this  enabled  Polish  and  later  British  cryptanalysts  to  read  the  German 
traffic. 

•  To  explain  the  use  of  the  Bombe  in  mounting  known  plaintext  attacks. 

8.1.  Introduction 

With  the  advent  of  the  1920s  people  saw  the  need  for  a  mechanical  encryption  device.  Taking  a 
substitution  cipher  and  then  rotating  it  was  identified  as  an  ideal  solution.  This  idea  had  actually 
been  used  previously  in  a  number  of  manual  ciphers,  but  mechanization  was  able  to  make  it  far 
more  efficient.  The  rotors  could  be  implemented  using  wires  and  then  encryption  could  be  done 
mechanically  using  an  electrical  circuit. 

By  rotating  the  rotor  we  obtain  a  new  substitution  cipher.  As  an  example,  suppose  the  rotor 
used  to  produce  the  substitutions  is  given  by  the  following  values  in  the  first  position: 

ABCDEFGHIJKLMNOPQRSTUVWXYZ
TMKGOYDSIPELUAVCRJWXZNHBQF. 

To  encrypt  the  first  letter  we  use  the  substitutions  given  above;  i.e.  we  substitute  B  by  M  and  Y 
by  Q.  However,  to  encrypt  the  second  letter  we  rotate  the  rotor  by  one  position,  i.e.  we  move  the 
bottom  row  one  step  to  the  left,  and  so  use  the  substitutions 

ABCDEFGHIJKLMNOPQRSTUVWXYZ
MKGOYDSIPELUAVCRJWXZNHBQFT, 

whilst  for  the  third  letter  we  use  the  substitutions 

ABCDEFGHIJKLMNOPQRSTUVWXYZ
KGOYDSIPELUAVCRJWXZNHBQFTM, 

and  so  on.  This  gives  us  a  polyalphabetic  substitution  cipher  with  26  different  alphabets. 
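The rotating substitution just described can be sketched as follows: for the letter at step i the bottom row is moved i places to the left before the lookup is made. This models only the simple description given here, not the full action of a rotor inside the Enigma machine; the function name is our own.

```python
import string

ROTOR = "TMKGOYDSIPELUAVCRJWXZNHBQF"  # bottom row in the first position

def rotor_encrypt(plaintext, rotor=ROTOR):
    out = []
    for step, ch in enumerate(plaintext.upper()):
        shift = step % 26
        row = rotor[shift:] + rotor[:shift]  # bottom row moved left by `shift`
        out.append(row[string.ascii_uppercase.index(ch)])
    return "".join(out)

print(rotor_encrypt("AAA"))                     # TMK
print(rotor_encrypt("B"), rotor_encrypt("Y"))   # M Q
```

Encrypting the same letter repeatedly simply reads off the rotor row, which illustrates why a single rotor gives 26 different alphabets.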

The  most  famous  of  these  machines  was  the  Enigma  machine  used  by  Germany  in  World  War 
II.  We  shall  describe  the  most  simple  version  of  Enigma  which  only  used  three  such  rotors,  chosen 
from  the  following  set  of  five: 

ABCDEFGHIJKLMNOPQRSTUVWXYZ
EKMFLGDQVZNTOWYHXUSPAIBRCJ
AJDKSIRUXBLHWTMCQGZNPYFVOE
BDFHJLCPRTXVZNYEIWGAKMUSQO 
ESOVPZJAYQUIRHXLNFTGKDCMWB 
VZBRGITYUPSDNHLXAWMJQOFECK. 
Machines in use towards the end of the war had a larger number of rotors, chosen from a larger set.
Note  that  the  order  of  the  rotors  in  the  machine  is  important,  so  the  number  of  ways  of  choosing 
the  rotors  is 

$5 \cdot 4 \cdot 3 = 60$.

Each rotor had an initial starting position, and since there are 26 possible starting positions for
each rotor, the total number of possible starting positions is $26^3 = 17576$.

The  first  rotor  would  step  on  the  second  rotor  on  each  full  iteration  under  the  control  of  a  ring 
hitting  a  notch;  likewise  the  stepping  of  the  third  rotor  was  controlled  by  the  second  rotor.  Both 
the  rings  were  movable  and  their  positions  again  formed  part  of  the  key,  although  only  the  notch 
and  ring  positions  for  the  first  two  rotors  were  important.  Hence,  the  number  of  ring  positions  was 
$26^2 = 676$. The second rotor also had a “kick” associated with it, making the cycle length of the
three rotors equal to

$26 \cdot 25 \cdot 26 = 16900$.

The  effect  of  the  moving  rotors  was  that  a  given  plaintext  letter  would  encrypt  to  a  different 
ciphertext  letter  on  each  press  of  the  keyboard.  Finally,  a  plugboard  was  used  to  swap  letters  twice 
in  each  encryption  and  decryption  operation.  This  increased  the  complexity  and  gave  another 
possible $10^{14}$ keys. The rotors used, their order, their starting positions, the ring positions and the
plugboard settings all made up the secret key. Hence, the total number of keys was then around
$2^{75}$. To make sure encryption and decryption were the same operation a reflector was used. This
was  a  fixed  public  substitution  given  by 

ABCDEFGHIJKLMNOPQRSTUVWXYZ
YRUHQSLDPXNGOKMIEBFZCWVJAT. 
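The key count quoted above can be checked by multiplying the individual contributions together. The sketch below uses the rough plugboard figure from the text, so the answer is itself only a rough estimate; the variable names are ours.

```python
import math

rotor_orders = 5 * 4 * 3      # ordered choice of three rotors from five
start_positions = 26 ** 3
ring_positions = 26 ** 2
plugboard = 10 ** 14          # the rough figure quoted in the text

total = rotor_orders * start_positions * ring_positions * plugboard
bits = math.log2(total)

print(rotor_orders, start_positions, ring_positions)  # 60 17576 676
print(f"about 2^{bits:.1f} keys")
```

With these inputs the product lies between $2^{75}$ and $2^{76}$, in line with the approximate figure given above.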

To  encrypt  a  plaintext  character  it  would  first  be  passed  through  the  plugboard  (thus  possibly 
swapping  it  to  another  letter),  then  it  passed  forwards  through  the  rotors,  then  through  the  reflector, 
then  backwards  through  the  rotors,  and  finally  it  would  pass  once  more  through  the  plugboard. 
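The path of a single key press can be sketched as below. This is a deliberately simplified model: the rotors are frozen in place (no stepping and no ring settings), the first three rotors of the list above are used in the order given, and the plugboard setting is a made-up example. Under these assumptions the sketch shows why the reflector makes encryption and decryption the same operation, and why no letter ever encrypts to itself.

```python
import string

A = string.ascii_uppercase
ROTORS = ["EKMFLGDQVZNTOWYHXUSPAIBRCJ",
          "AJDKSIRUXBLHWTMCQGZNPYFVOE",
          "BDFHJLCPRTXVZNYEIWGAKMUSQO"]
REFLECTOR = "YRUHQSLDPXNGOKMIEBFZCWVJAT"
PLUGS = {"A": "M", "M": "A", "T": "Q", "Q": "T"}  # hypothetical plugboard setting

def press(ch):
    """One key press with the rotors frozen in place."""
    ch = PLUGS.get(ch, ch)
    for wiring in ROTORS:                # forwards through the rotors
        ch = wiring[A.index(ch)]
    ch = REFLECTOR[A.index(ch)]          # through the reflector
    for wiring in reversed(ROTORS):      # backwards through the rotors
        ch = A[wiring.index(ch)]
    return PLUGS.get(ch, ch)

# The reflector swaps letters in pairs ...
assert all(REFLECTOR[A.index(REFLECTOR[A.index(c)])] == c for c in A)
# ... so pressing twice returns the original letter, and no letter maps to itself.
assert all(press(press(c)) == c and press(c) != c for c in A)
```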

The operation of a simplified four-letter Enigma machine is depicted in Figure 8.1. By tracing
the red lines one can see how the plaintext character A encrypts to the ciphertext character D.
Notice that decryption can be performed with the machine in the same configuration as used for
encryption. Now assume that rotor one moves on one step, so A now maps to D under rotor one, B
to  A,  C  to  C  and  D  to  B.  You  should  work  out  what  happens  with  the  example  when  we  encrypt 
A  again. 


Figure 8.1. Simplified Enigma machine (labelled parts: lamps, keyboard, plugs and three rotors)


In  the  rest  of  this  chapter  we  present  more  details  of  the  Enigma  machine  and  some  of  the 
attacks which can be performed on it. However before presenting the machine itself we need to
fix some notation which will be used throughout this chapter. In particular lower-case letters will
denote variables, upper-case letters will denote “letters” (of the plaintext/ciphertext languages)
and Greek letters will denote permutations in $S_{26}$ which we shall represent as permutations on the
upper-case letters. Hence $x$ can equal X and Y, but X can only ever represent X, whereas $\chi$ could
represent (XY) or (ABC).

Permutations  will  usually  be  given  in  cycle  notation.  One  always  has  to  make  a  choice  as 
to  whether  we  multiply  permutations  from  left  to  right,  or  right  to  left.  We  decide  to  use  the 
left-to-right  method,  hence 

(ABCD)(BE)(CD) = (AEBD).

Permutations hence act on the right of letters, something we will denote by $x^{\sigma}$, e.g.

$A^{(ABCD)(XY)} = B$.

This  is  consistent  with  the  usual  notation  of  right  action  for  groups  acting  on  sets.  See  the  appendix 
for  more  details  about  permutations. 
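The left-to-right convention is easy to get wrong, so a small sketch may help. The helper names below are our own; `product` multiplies the cycles of a product from left to right and `act` gives the image of a letter under the right action.

```python
import string

LETTERS = string.ascii_uppercase

def cycle_to_map(cycle):
    """The permutation given by a single cycle, as a letter -> letter mapping."""
    perm = {c: c for c in LETTERS}
    for i, c in enumerate(cycle):
        perm[c] = cycle[(i + 1) % len(cycle)]
    return perm

def product(spec):
    """Multiply the cycles in spec from left to right."""
    result = {c: c for c in LETTERS}
    for cyc in spec.replace(" ", "").strip("()").split(")("):
        step = cycle_to_map(cyc)
        result = {c: step[result[c]] for c in LETTERS}  # earlier factors act first
    return result

def act(letter, spec):
    """The image of letter under the product, i.e. letter^spec in the notation above."""
    return product(spec)[letter]

def to_cycles(perm):
    """Write a permutation in disjoint cycle notation, omitting fixed points."""
    seen, out = set(), []
    for c in LETTERS:
        if c in seen or perm[c] == c:
            continue
        cyc, nxt = [c], perm[c]
        while nxt != c:
            cyc.append(nxt)
            nxt = perm[nxt]
        seen.update(cyc)
        out.append("(" + "".join(cyc) + ")")
    return "".join(out)

print(to_cycles(product("(ABCD)(BE)(CD)")))  # (AEBD)
print(act("A", "(ABCD)(XY)"))                # B
```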

We  now  collect  some  basic  facts  and  theorems  about  permutations  which  we  will  need  in  the  sequel. 

Theorem 8.1. Two permutations $\sigma$ and $\tau$ which are conjugate, i.e. ones for which $\sigma = \lambda \cdot \tau \cdot \lambda^{-1}$
for some permutation $\lambda$, have the same cycle structure.

We define the support of a permutation to be the set of letters which are not fixed by the permutation.
Hence, if $\sigma$ acts on the set of letters $C$, then as usual we denote by $C^{\sigma}$ the set of fixed points
and hence the support is given by

$C \setminus C^{\sigma}$.

Theorem  8.2.  If  two  permutations,  with  the  same  support,  consist  only  of  disjoint  transpositions 
then  their  product  contains  an  even  number  of  disjoint  cycles  of  the  same  length.  Conversely,  if  a 
permutation  with  support  an  even  number  of  symbols  has  an  even  number  of  disjoint  cycles  of  the 
same  length,  then  the  permutation  can  be  written  as  a  product  of  two  permutations  each  of  which 
consists  of  disjoint  transpositions. 
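As a quick sanity check of the first half of Theorem 8.2, the following Python sketch multiplies two permutations made of thirteen disjoint transpositions each and lists the cycle lengths of the product. The two permutations are borrowed from the worked example later in this chapter (the values deduced there for the first and fourth letter encryptions); the helper names are illustrative.

```python
import string

ALPHABET = string.ascii_uppercase

def involution(text):
    """Product of disjoint transpositions such as '(AB)(CD)' -> letter -> image."""
    perm = {c: c for c in ALPHABET}
    for a, b in text.replace(" ", "").strip("()").split(")("):
        perm[a], perm[b] = b, a
    return perm

def cycle_lengths(perm):
    """Sorted list of cycle lengths, counting fixed points as cycles of length 1."""
    seen, lengths = set(), []
    for start in ALPHABET:
        n, c = 0, start
        while c not in seen:
            seen.add(c)
            c = perm[c]
            n += 1
        if n:
            lengths.append(n)
    return sorted(lengths)

first = involution("(AL)(BF)(CH)(DE)(GX)(IR)(JO)(KP)(MZ)(NW)(QU)(ST)(VY)")
second = involution("(AF)(BQ)(CY)(DL)(ES)(GX)(HV)(IN)(JP)(KW)(MT)(OU)(RZ)")

# Left-to-right product: apply `first`, then `second`.
prod = {c: second[first[c]] for c in ALPHABET}
print(cycle_lengths(prod))  # [1, 1, 2, 2, 10, 10]
```

Every cycle length occurs an even number of times, as the theorem predicts. The two fixed points arise because both permutations contain the transposition (GX).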

Solving a Conjugation Problem: In many places we need an algorithm to solve the following
problem: Given $\alpha_i, \beta_i \in S_{26}$, for $i = 1, \ldots, m$, find $\gamma \in S_{26}$ such that

$$\alpha_i = \gamma^{-1} \cdot \beta_i \cdot \gamma \quad \text{for } i = 1, \ldots, m.$$

Whilst there could be many such solutions $\gamma$, in the situations to which we will apply it we expect
there to be only a few. For example, suppose we have one such equation with

$$\alpha_1 = (AFCNE)(BWXHUJOG)(DVIQZ)(KLMYTRPS),$$
$$\beta_1 = (AEYSXWUJ)(BFZNO)(CDPKQ)(GHIVLMRT).$$

We need to determine the structure of the permutation $\gamma$ such that

$$\alpha_1 = \gamma^{-1} \cdot \beta_1 \cdot \gamma.$$

We first look at what A should map to under $\gamma$. Suppose $A^{\gamma} = B$; then we have the equations

$$A^{\gamma \cdot \alpha_1} = B^{\alpha_1} = W \quad \text{and} \quad A^{\beta_1 \cdot \gamma} = E^{\gamma}.$$

Thus we have $E^{\gamma} = W$. We then look at the equations

$$E^{\gamma \cdot \alpha_1} = W^{\alpha_1} = X \quad \text{and} \quad E^{\beta_1 \cdot \gamma} = Y^{\gamma}.$$

So we have $Y^{\gamma} = X$. Continuing in this way via a pruned depth-first search we can determine a set
of possible values for $\gamma$. Such an algorithm is relatively simple to write down in C, using a recursive
procedure call. However, it is, of course, a bit of a pain to do this by hand, as would have been the
only option in the 1930s and 1940s.
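The pruned depth-first search just described can be sketched as follows. This is a Python illustration rather than the C routine the text has in mind, and the function names are hypothetical. It relies only on the rule derived above: if $x^{\gamma} = y$ then $(x^{\beta_i})^{\gamma} = y^{\alpha_i}$.

```python
import string

ALPHABET = string.ascii_uppercase

def parse_cycles(text):
    """Disjoint cycles such as '(AB)(CDE)' -> dict letter -> image."""
    perm = {c: c for c in ALPHABET}
    for cycle in text.replace(" ", "").strip("()").split(")("):
        perm.update(zip(cycle, cycle[1:] + cycle[:1]))
    return perm

def solve_conjugation(pairs):
    """Yield every gamma with alpha = gamma^-1 * beta * gamma for each (alpha, beta)."""
    gamma, used = {}, set()

    def undo(letters):
        for x in letters:
            used.discard(gamma.pop(x))

    def assign(x, y):
        """Record x^gamma = y and all forced consequences. Return the letters
        newly assigned, or None (after undoing them) on a contradiction."""
        added, todo = [], [(x, y)]
        while todo:
            x, y = todo.pop()
            if x in gamma:
                if gamma[x] == y:
                    continue
            elif y not in used:
                gamma[x] = y
                used.add(y)
                added.append(x)
                # gamma*alpha = beta*gamma forces (x^beta)^gamma = y^alpha.
                todo.extend((beta[x], alpha[y]) for alpha, beta in pairs)
                continue
            undo(added)
            return None
        return added

    def search():
        free = next((c for c in ALPHABET if c not in gamma), None)
        if free is None:
            yield dict(gamma)
            return
        for y in ALPHABET:
            added = assign(free, y)
            if added is not None:
                yield from search()
                undo(added)

    yield from search()

alpha1 = parse_cycles("(AFCNE)(BWXHUJOG)(DVIQZ)(KLMYTRPS)")
beta1 = parse_cycles("(AEYSXWUJ)(BFZNO)(CDPKQ)(GHIVLMRT)")
solutions = list(solve_conjugation([(alpha1, beta1)]))
print(len(solutions))  # 6400
print({g["E"] + g["Y"] for g in solutions if g["A"] == "B"})  # {'WX'}
```

A single equation leaves many candidates, since any two solutions differ by a permutation commuting with $\alpha_1$. Passing several pairs to `solve_conjugation` prunes the search much harder, which is the situation the later sections rely on.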

8.2. An Equation for the Enigma

To  aid  our  discussion  in  later  sections  we  now  describe  the  Enigma  machine  as  a  permutation 
equation.  We  first  assume  a  canonical  map  between  letters  and  the  integers  {0, 1, . . . ,  25}  such  that 
0  is  A,  1  is  B  etc.  and  we  assume  a  standard  three-wheel  Enigma  machine. 

The  wheel  which  turns  the  fastest  we  shall  call  rotor  one,  whilst  the  one  which  turns  the 
slowest  we  shall  call  rotor  three.  This  means  that,  when  looking  at  a  real  machine  rotor  three  is 
the  leftmost  rotor  and  rotor  one  is  the  rightmost  rotor.  Please  keep  this  in  mind  as  it  can  cause 
confusion  (especially  when  reading  day/message  settings).  The  basic  permutations  which  make  up 
the  Enigma  machine  are  as  follows. 

Choice of Rotors: We assume that the three rotors are chosen from the following set of five
rotors, presented in the table below. The Germans labelled these rotors I, II, III, IV and V,
and they are the ones used in the actual Enigma machines. Each rotor also has a different notch
position which controls how the stepping of one rotor drives the stepping of the others.

Rotor   Permutation Representation                  Notch Position
I       (AELTPHQXRU)(BKNW)(CMOY)(DFG)(IV)(JZ)       16, i.e. Q
II      (BJ)(CDKLHUP)(ESZ)(FIXVYOMW)(GR)(NT)        4, i.e. E
III     (ABDHPEJT)(CFLVMZOYQIRWUKXSG)               21, i.e. V
IV      (AEPLIYWCOXMRFZBSTGJQNH)(DV)(KU)            9, i.e. J
V       (AVOLDRWFIUQ)(BZKSMNHYC)(EGTJPX)            25, i.e. Z

Reflector: A number of different reflectors were used in actual Enigma machines. In our
description we shall use the reflector given earlier, which is often referred to as “Reflector B”. This
reflector has representation via disjoint cycles as

$$\varrho = (AY)(BR)(CU)(DH)(EQ)(FS)(GL)(IP)(JX)(KN)(MO)(TZ)(VW).$$


An  Enigma  Key:  An  Enigma  key  consists  of  the  following  information: 

• A choice of rotors $\rho_1$, $\rho_2$, $\rho_3$ from the above choice of five possible rotors. Note that this
choice of rotors affects the three notch positions, which we shall denote by $n_1$, $n_2$ and $n_3$.
Also, as noted above, the rotor $\rho_3$ is placed in the left of the actual machine, whilst rotor
$\rho_1$ is placed on the right. Hence, if in a German code book it says use rotors

I, II, III,

this means in our notation that $\rho_1$ is selected to be rotor III, that $\rho_2$ is selected to be
rotor II and $\rho_3$ is selected to be rotor I.

• One must also select the ring positions, which we shall denote by $r_1$, $r_2$ and $r_3$. In the
actual machine these are letters, but we shall use our canonical numbering to represent
these as integers in $\{0, 1, \ldots, 25\}$.

• The plugboard is simply a product of disjoint transpositions which we shall denote by the
permutation $\tau$. In what follows we shall denote a plug linking letter A with letter B by
A ↔ B.

• The starting rotor positions we shall denote by $p_1$, $p_2$ and $p_3$. These are the letters which
can be seen through the windows on the top of the Enigma machine. Remember our
numbering system is that the window on the left corresponds to $p_3$ whilst the one on the
right corresponds to $p_1$.
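Because the code book lists everything left to right while our indices run from the fastest (rightmost) rotor, it can help to see the conversion written out. The sketch below is illustrative only: the class `EnigmaKey` and the function `key_from_codebook` are hypothetical names, and the notch table simply repeats the values listed for rotors I to V above.

```python
import string
from dataclasses import dataclass

ALPHABET = string.ascii_uppercase
NOTCH = {"I": 16, "II": 4, "III": 21, "IV": 9, "V": 25}

@dataclass
class EnigmaKey:
    rotors: tuple      # names of (rho_1, rho_2, rho_3), fastest rotor first
    notches: tuple     # (n_1, n_2, n_3)
    rings: tuple       # (r_1, r_2, r_3) as integers in 0..25
    positions: tuple   # (p_1, p_2, p_3) as integers in 0..25
    plugs: tuple = ()  # pairs of plugged letters

def key_from_codebook(rotors, rings, positions, plugs=()):
    """Convert left-to-right code book entries into the indexing used here."""
    names = tuple(reversed(rotors.split()))

    def to_ints(letters):
        # The leftmost window/ring belongs to rotor three, so reverse the order.
        return tuple(ALPHABET.index(c) for c in reversed(letters))

    return EnigmaKey(names, tuple(NOTCH[n] for n in names),
                     to_ints(rings), to_ints(positions), tuple(plugs))

key = key_from_codebook("III II I", "PPD", "MDL")
print(key.rotors)     # ('I', 'II', 'III')
print(key.rings)      # (3, 15, 15)
print(key.positions)  # (11, 3, 12)
```

So a code book entry of rotors III, II, I with rings PPD and positions MDL means $\rho_1$ is rotor I with $r_1 = 3$ and $p_1 = 11$, and so on.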
The Encryption Operation: We let $\sigma$ denote the shift-up permutation given by

$$\sigma = (ABCDEFGHIJKLMNOPQRSTUVWXYZ).$$

The stepping of the second and third rotors is probably the hardest part to grasp when first looking
at an Enigma machine; however, it has a relatively simple description when one looks at it in a
mathematical manner.

Given the above description of the key we wish to deduce the permutation $\epsilon_j$, which represents,
for $j = 0, 1, 2, \ldots$, the encryption of the $j$th letter. We first set

$$\begin{aligned}
m_1 &= n_1 - p_1 - 1 \pmod{26}, \\
m &= n_2 - p_2 - 1 \pmod{26}, \\
m_2 &= m_1 + 1 + 26 \cdot m.
\end{aligned}$$

The values of $m_1$ and $m_2$ control the stepping of the second and third rotors.

We let $\lfloor x \rfloor$ denote the round towards zero function, i.e. $\lfloor 1.9 \rfloor = 1$ and
$\lfloor -1.9 \rfloor = -1$. We now set, for encrypting letter $j$,

$$\begin{aligned}
k_1 &= \lfloor (j - m_1 + 26)/26 \rfloor, \\
k_2 &= \lfloor (j - m_2 + 650)/650 \rfloor, \\
i_1 &= p_1 - r_1 + 1, \\
i_2 &= p_2 - r_2 + k_1 + k_2, \\
i_3 &= p_3 - r_3 + k_2.
\end{aligned}$$

Notice how $i_3$ is stepped on every $650 = 26 \cdot 25$ iterations whilst $i_2$ is stepped on every 26 iterations
and also stepped on an extra notch every 650 iterations.

We can now present $\epsilon_j$ as

$$\begin{aligned}
\epsilon_j = {} & \tau \cdot (\sigma^{i_1+j} \cdot \rho_1 \cdot \sigma^{-i_1-j}) \cdot (\sigma^{i_2} \cdot \rho_2 \cdot \sigma^{-i_2}) \cdot (\sigma^{i_3} \cdot \rho_3 \cdot \sigma^{-i_3}) \cdot \varrho \\
& \cdot (\sigma^{i_3} \cdot \rho_3^{-1} \cdot \sigma^{-i_3}) \cdot (\sigma^{i_2} \cdot \rho_2^{-1} \cdot \sigma^{-i_2}) \cdot (\sigma^{i_1+j} \cdot \rho_1^{-1} \cdot \sigma^{-i_1-j}) \cdot \tau.
\end{aligned}$$

Note that the same equation/machine is used to encrypt the $j$th letter as is used to decrypt the
$j$th letter. Hence we have

$$\epsilon_j^{-1} = \epsilon_j.$$

Also note that each $\epsilon_j$ consists of a product of disjoint transpositions. We shall always use $\gamma_j$ to
represent the internal rotor part of the Enigma machine, hence

$$\epsilon_j = \tau \cdot \gamma_j \cdot \tau.$$
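The equation for the $j$th letter translates almost line by line into code. The following Python sketch is one possible transcription, not a definitive implementation: the function names are illustrative, only rotors I to III are included, and `int()` is used for the round-towards-zero function. The key used at the end is the day setting of the worked example in Section 8.3.1 (rotors III, II, I, rings PPD, positions MDL) together with the six plugs revealed there.

```python
import string

ALPHABET = string.ascii_uppercase

def perm(cycles):
    """Cycle notation -> list p, with p[x] the image of letter number x."""
    p = list(range(26))
    for cyc in cycles.replace(" ", "").strip("()").split(")("):
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[ALPHABET.index(a)] = ALPHABET.index(b)
    return p

def inverse(p):
    q = [0] * 26
    for x, y in enumerate(p):
        q[y] = x
    return q

ROTORS = {  # name -> (permutation, notch position), copied from the rotor table
    "I": (perm("(AELTPHQXRU)(BKNW)(CMOY)(DFG)(IV)(JZ)"), 16),
    "II": (perm("(BJ)(CDKLHUP)(ESZ)(FIXVYOMW)(GR)(NT)"), 4),
    "III": (perm("(ABDHPEJT)(CFLVMZOYQIRWUKXSG)"), 21),
}
REFLECTOR_B = perm("(AY)(BR)(CU)(DH)(EQ)(FS)(GL)(IP)(JX)(KN)(MO)(TZ)(VW)")

def epsilon(j, rotors, rings, pos, plugs=""):
    """epsilon_j as a list; rotors, rings and pos are ordered (1, 2, 3)."""
    (rho1, n1), (rho2, n2), (rho3, _) = (ROTORS[name] for name in rotors)
    (r1, r2, r3), (p1, p2, p3) = rings, pos
    m1 = (n1 - p1 - 1) % 26
    m2 = m1 + 1 + 26 * ((n2 - p2 - 1) % 26)
    k1 = int((j - m1 + 26) / 26)    # int() rounds towards zero
    k2 = int((j - m2 + 650) / 650)
    shifts = [(rho1, p1 - r1 + 1 + j),
              (rho2, p2 - r2 + k1 + k2),
              (rho3, p3 - r3 + k2)]
    tau = perm(plugs)

    def through(x, rho, i):
        # Action of sigma^i * rho * sigma^-i on x (left to right).
        return (rho[(x + i) % 26] - i) % 26

    def encrypt(x):
        x = tau[x]
        for rho, i in shifts:
            x = through(x, rho, i)
        x = REFLECTOR_B[x]
        for rho, i in reversed(shifts):
            x = through(x, inverse(rho), i)
        return tau[x]

    return [encrypt(x) for x in range(26)]

key = dict(rotors=("I", "II", "III"), rings=(3, 15, 15), pos=(11, 3, 12),
           plugs="(AB)(CD)(EF)(GH)(IJ)(KL)")
cipher = "HUCDODANDH"
plain = "".join(ALPHABET[epsilon(j, **key)[ALPHABET.index(c)]]
                for j, c in enumerate(cipher))
print(plain)  # ITWASTHEBE
```

Calling `epsilon` with `plugs=""` gives the rotor-only permutation $\gamma_j$. The ten letters decrypted here span both a step of the second rotor and the extra step of the second and third rotors, so they exercise $k_1$ and $k_2$ as well.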


8.3.  Determining  the  Plugboard  Given  the  Rotor  Settings 

For the moment, assume that we know values for the rotor order, ring settings and rotor positions;
i.e. we are given $\gamma_j$. We would like to determine the plugboard settings. The goal
is therefore to determine $\tau$ given some information about $\epsilon_j$ for some values of $j$. One often
sees it written that determining the plugboard given the rotor settings is equivalent to solving a
substitution cipher. This is true, but the methods given in some sources are too simplistic.

Let $m$ denote the actual message being encrypted, $c$ the corresponding ciphertext, and $m'$ the
ciphertext decrypted under the cipher with no plugboard, i.e. with the obvious notation,

$$m = c^{\epsilon}, \qquad m' = c^{\gamma}.$$

The following is an example value of $m'$ for a plugboard containing only one plug

ZNCT UPZN A EIME, THERE WAS A GILL CALLED SNZW WHFTE.

I have left the spacings in the English words. You may then deduce that Z should really map to
O, or T should really map to E, or E should really map to T, or maybe L should really map to R,
etc. But which should be the correct plug setting to achieve this mapping? The actual correct plug
setting is that O should map to Z; the other mappings are the result of this single plug setting.
We now present some ways of obtaining information about the plugboard given various scenarios.

8.3.1. Ciphertext Only Attack: In a ciphertext only attack one can proceed as one would
for a normal substitution cipher. We need a method to distinguish something which could be
natural language from something which is completely random. The best option seems to be to
use something called the Sinkov statistic. Let $f_i$, for $i = A, \ldots, Z$, denote the frequencies of the
various letters in standard English. For a given piece of text we let $n_i$, for $i = A, \ldots, Z$, denote
the frequencies of the various letters within the sample piece of text. The Sinkov statistic for the
sample text is given by

$$s = \sum_{i=A}^{Z} n_i \cdot f_i.$$

The higher the value of $s$, the more likely that the text represents an extract from a natural
language.
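To make the statistic concrete, here is a small Python sketch that computes the sum of products of sample frequencies and English frequencies. The table of English letter frequencies is an assumption of this sketch (rounded, commonly quoted percentages) and is not taken from the text; the two sample strings are the opening of the plaintext and ciphertext of the example that follows.

```python
from collections import Counter

# Approximate English letter frequencies in percent (assumed values).
ENGLISH = dict(zip("ABCDEFGHIJKLMNOPQRSTUVWXYZ", [
    8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.0, 0.2, 0.8, 4.0, 2.4,
    6.7, 7.5, 1.9, 0.1, 6.0, 6.3, 9.1, 2.8, 1.0, 2.4, 0.2, 2.0, 0.1]))

def sinkov(text):
    """Sum over letters of (frequency in sample) * (frequency in English)."""
    letters = [c for c in text.upper() if c in ENGLISH]
    if not letters:
        return 0.0
    counts = Counter(letters)
    return sum(n / len(letters) * ENGLISH[c] / 100 for c, n in counts.items())

english_like = "ITWASTHEBESTOFTIMESITWASTHEWORSTOFTIMES"
random_like = "HUCDODANDHOMYXUMGLREDSQQJDNJAEXUKAZOYGB"
print(sinkov(english_like) > sinkov(random_like))  # True
```

For text with roughly uniform letter frequencies the statistic sits near 1/26, whereas English text scores noticeably higher, which is what makes it usable as a test statistic.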

To mount a ciphertext only attack we let $\gamma_j$ denote our current approximation for $\epsilon_j$ (initially $\gamma_j$
has no plug settings, but this will change as the method progresses). We now go through all possible
single-plug settings, $\alpha^{(k)}$. There are $26 \cdot 25/2 = 325$ of these. We then decrypt the ciphertext $c$
using the cipher

$$\alpha^{(k)} \cdot \gamma_j \cdot \alpha^{(k)}.$$

This results in 325 possible plaintext messages $m^{(k)}$ for $k = 1, \ldots, 325$. For each one of these we
compute the Sinkov statistic $s_k$ and keep the value of $\alpha^{(k)}$ which results in $s_k$ being maximized.
We then set our new $\gamma_j$ to be $\alpha^{(k)} \cdot \gamma_j \cdot \alpha^{(k)}$ and repeat until no further improvement can be made
in the test statistic.

This methodology seems very good at finding the missing plugboard settings. Suppose we are
given that the day setting is

Rotors       Rings   Pos   Plugboard
III, II, I   PPD     MDL   Unknown

The actual hidden plugboard is given by A ↔ B, C ↔ D, E ↔ F, G ↔ H, I ↔ J and K ↔ L. We
obtain the ciphertext

HUCDODANDHOMYXUMGLREDSQQJDNJAEXUKAZOYGBYLEWFNWIBWILSMAETFFBVPR 

GBYUDNAAIEVZZKCUFNIUTOKNKAWUTUWQJYAUHMFWJNIQHAYNAGTDGTCTNYKTCU 

FGYQBSRRUWZKZFWKPGVLUHYWZCZSOYJNXHOSKVPHGSGSXEOQWOZYBXQMKQDDXM 

BJUPSQODJNIYEPUCEXFRHDQDAQDTFKPSZEMASWGKVOXUCEYWBKFQCYZBOGSFES 

OELKDUTDEUQZKMUIZOGVTWKUVBHLVXMIXKQGUMMQHDLKFTKRXCUNUPPFKWUFCU 

PTDMJBMMTPIZIXINRUIEMKDYQFMIQAEVLWJRCYJCUKUFYPSLQUEZFBAGSJHVOB 

CHAKHGHZAVJZWOLWLBKNTHVDEBULROARWOQGZLRIQBVVSNKRNUCIKSZUCXEYBD 

QKCVMGLRGFTBGHUPDUHXIHLQKLEMIZKHDEPTDCIPF 

The plugboard settings are found in the following order: I ↔ J, E ↔ F, A ↔ B, G ↔ H, K ↔ L
and C ↔ D. The plaintext is determined to be:

ITWASTHEBESTOFTIMESITWASTHEWORSTOFTIMESITWASTHEAGEOFWISDOMITWA 

STHEAGEOFFOOLISHNESSITWASTHEEPOCHOFBELIEFITWASTHEEPOCHOFINCRED 

ULITYITWASTHESEASONOFLIGHTITWASTHESEASONOFDARKNESSITWASTHESPRI 

NGOFHOPEITWASTHEWINTEROFDESPAIRWEHADEVERYTHINGBEFOREUSWEHADNOT 

HINGBEFOREUSWEWEREALLGOINGDIRECTTOHEAVENWEWEREALLGOINGDIRECTTH 

EOTHERWAYINSHORTTHEPERIODWASSOFARLIKETHEPRESENTPERIODTHATSOMEO 
FITSNOISIESTAUTHORITIESINSISTEDONITSBEINGRECEIVEDFORGOODORFORE

VILINTHESUPERLATIVEDEGREEOFCOMPARISONONLY 

8.3.2. Known Plaintext Attack: When one knows the plaintext there is a choice of two methods
one can employ. The first method is simply based on a depth-first search technique, whilst the
second makes use of some properties of the encryption operation.

Technique One: In the first technique we take each wrong letter in turn, from our current
approximation $\gamma_j$ to $\epsilon_j$. In the above example of the encryption of the first sentences from “A Tale of
Two Cities”, we have that the first ciphertext letter H should map to the plaintext letter I. This
implies that the plugboard must contain plug settings H ↔ $p_H$ and I ↔ $p_I$, for letters $p_H$ and $p_I$
with

$$p_H^{\gamma_0} = p_I.$$

In a similar manner we deduce the following further equations:

$$\begin{aligned}
p_U^{\gamma_1} &= p_T, & p_C^{\gamma_2} &= p_W, & p_D^{\gamma_3} &= p_A, \\
p_O^{\gamma_4} &= p_S, & p_D^{\gamma_5} &= p_T, & p_A^{\gamma_6} &= p_H, \\
p_N^{\gamma_7} &= p_E, & p_D^{\gamma_8} &= p_B, & p_H^{\gamma_9} &= p_E.
\end{aligned}$$

The various permutations representing the first few $\gamma_j$s for the given rotor and ring positions are
as follows:

γ_0 = (AW)(BH)(CZ)(DE)(FT)(GJ)(IN)(KL)(MQ)(OV)(PU)(RS)(XY),
γ_1 = (AZ)(BL)(CE)(DH)(FK)(GJ)(IS)(MX)(NQ)(OY)(PR)(TU)(VW),
γ_2 = (AZ)(BJ)(CV)(DW)(EP)(FX)(GO)(HS)(IY)(KL)(MN)(QT)(RU),
γ_3 = (AF)(BC)(DY)(EO)(GU)(HK)(IV)(JR)(LX)(MN)(PW)(QS)(TZ),
γ_4 = (AJ)(BD)(CF)(EL)(GN)(HX)(IM)(KQ)(OS)(PV)(RT)(UY)(WZ),
γ_5 = (AW)(BZ)(CT)(DI)(EH)(FV)(GU)(JO)(KP)(LN)(MX)(QY)(RS),
γ_6 = (AL)(BG)(CO)(DV)(EN)(FS)(HY)(IZ)(JT)(KW)(MP)(QR)(UX),
γ_7 = (AI)(BL)(CT)(DE)(FN)(GH)(JY)(KZ)(MO)(PS)(QX)(RU)(VW),
γ_8 = (AC)(BH)(DU)(EM)(FQ)(GV)(IO)(JZ)(KS)(LT)(NR)(PX)(WY),
γ_9 = (AB)(CM)(DY)(EZ)(FG)(HN)(IR)(JX)(KV)(LW)(OT)(PQ)(SU).

We now proceed as follows: suppose we know that exactly six plugs are being used. This means
that if we pick a letter at random, say T, then there is a 14/26 ≈ 0.54 chance that this letter is
not plugged to another one. Let us therefore make this assumption for the letter T, in which case
$p_T = T$. From the above equations involving $\gamma_1$ and $\gamma_5$ we then deduce that

$$p_U = U \quad \text{and} \quad p_D = C.$$

We then use the equations involving $\gamma_3$ and $\gamma_8$, since we now know $p_D$, to deduce that

$$p_A = B \quad \text{and} \quad p_B = A.$$

These last two checks are consistent, so we can assume that our original choice of $p_T = T$ was a
good one. From the equation involving $\gamma_6$, using $p_A = B$ we deduce that

$$p_H = G.$$

Using this in the equations involving $\gamma_0$ and $\gamma_9$ we deduce that

$$p_I = J \quad \text{and} \quad p_E = F.$$

We then find that our five plug settings of A ↔ B, C ↔ D, E ↔ F, G ↔ H and I ↔ J allow us
to decrypt the first ten letters correctly. To deduce the final plug setting will require a longer piece
of ciphertext, and the corresponding piece of known plaintext.
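The chain of deductions above is mechanical, so it can be sketched in a few lines of Python. The sketch starts from a guessed plug value, uses each equation of the form "image of the ciphertext letter's plug under the rotor permutation equals the plaintext letter's plug" in both directions, and reports a contradiction by returning `None`. The ten rotor permutations are copied from the list above; the function name `deduce` is illustrative.

```python
GAMMAS = [  # rotor-only permutations for the first ten letters
    "(AW)(BH)(CZ)(DE)(FT)(GJ)(IN)(KL)(MQ)(OV)(PU)(RS)(XY)",
    "(AZ)(BL)(CE)(DH)(FK)(GJ)(IS)(MX)(NQ)(OY)(PR)(TU)(VW)",
    "(AZ)(BJ)(CV)(DW)(EP)(FX)(GO)(HS)(IY)(KL)(MN)(QT)(RU)",
    "(AF)(BC)(DY)(EO)(GU)(HK)(IV)(JR)(LX)(MN)(PW)(QS)(TZ)",
    "(AJ)(BD)(CF)(EL)(GN)(HX)(IM)(KQ)(OS)(PV)(RT)(UY)(WZ)",
    "(AW)(BZ)(CT)(DI)(EH)(FV)(GU)(JO)(KP)(LN)(MX)(QY)(RS)",
    "(AL)(BG)(CO)(DV)(EN)(FS)(HY)(IZ)(JT)(KW)(MP)(QR)(UX)",
    "(AI)(BL)(CT)(DE)(FN)(GH)(JY)(KZ)(MO)(PS)(QX)(RU)(VW)",
    "(AC)(BH)(DU)(EM)(FQ)(GV)(IO)(JZ)(KS)(LT)(NR)(PX)(WY)",
    "(AB)(CM)(DY)(EZ)(FG)(HN)(IR)(JX)(KV)(LW)(OT)(PQ)(SU)",
]
CIPHER, PLAIN = "HUCDODANDH", "ITWASTHEBE"

def involution(text):
    perm = {}
    for a, b in text.strip("()").split(")("):
        perm[a], perm[b] = b, a
    return perm

def deduce(letter, image):
    """Follow all consequences of assuming `letter` is plugged to `image`."""
    gammas = [involution(g) for g in GAMMAS]
    plug, todo = {}, [(letter, image)]
    while todo:
        x, y = todo.pop()
        if plug.get(x, y) != y or plug.get(y, x) != x:
            return None                  # contradicts an earlier deduction
        if x in plug:
            continue                     # already known
        plug[x], plug[y] = y, x          # a plug is symmetric
        for gamma, c, m in zip(gammas, CIPHER, PLAIN):
            # The equation links the plugs of c and m; gamma is an involution,
            # so it can be used in either direction.
            for known, other in ((c, m), (m, c)):
                if known in plug:
                    todo.append((other, gamma[plug[known]]))
    return plug

plug = deduce("T", "T")
print(sorted(a + b for a, b in plug.items() if a < b))
# ['AB', 'CD', 'EF', 'GH', 'IJ']
```

With the same ten letters the search also learns that N, U and W are unplugged, while nothing is learned about O or S, which is consistent with the remark that more known plaintext is needed to find the last plug.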
This technique can also be used when one knows partial information about the rotor positions.
For example, many of the following techniques will allow us to deduce the differences $p_i - r_i$, but
not the actual values of $r_i$ or $p_i$. However, by applying the above technique, on assuming $r_i$ = ‘A’,
we will at some point deduce a contradiction. At this point we know that either a rotor turnover
has occurred incorrectly, or one has not occurred when it should have done. Hence, we can at this
point backtrack and deduce the correct turnover. For an example of this technique at work see the
later section on the Bombe.

Technique Two: A second method is possible when fewer than thirteen plugs are used. In the
plaintext obtained under $\gamma_j$ a number of incorrect letters will appear. Again we let $m$ denote the
actual plaintext and $m'$ the plaintext derived under the current (possibly empty) plugboard setting.
We suppose that there are $t$ plugs left to find.

Suppose we concentrate on each place for which the incorrect plaintext letter A occurs, i.e.
all occurrences of A in the plaintext $m$ which are wrong in $m'$. Let $x$ denote the corresponding
ciphertext letter. There are two possible cases which can occur:

• The letter $x$ should be plugged to an unknown letter. In this case the resulting letter in
the message $m'$ will behave randomly (assuming $\gamma_j$ acts like a random permutation).

• The letter $x$ does not occur in a plugboard setting. In this case the resulting incorrect
plaintext character is the one which should be plugged to A in the actual cipher.

Assuming ciphertext letters are uniformly distributed, the first case will occur with probability
$t/13$, whilst the alternative will occur with probability $1 - t/13$. This gives the following method
for determining the letter to which A should be connected. For all occurrences of A in the plaintext
$m$ compute the frequency of the corresponding letter in the approximate plaintext $m'$. The one
with the highest frequency is highly likely to be the one which should be connected to A on the
plugboard. Indeed we expect the proportion of such positions with the correct letter to be given
by $1 - t/13$, whilst all other letters we expect to occur in proportions of $t/(13 \cdot 26)$ each.

The one problem with this second technique is that it requires a relatively large amount of
known plaintext. Hence, in practice the first technique is more likely to be used.

Knowledge of $\epsilon_j$ for Some $j$s: If we know the value of the permutation $\epsilon_j$ for values of $j \in \mathcal{S}$,
then we have the following equation

$$\epsilon_j = \tau \cdot \gamma_j \cdot \tau \quad \text{for } j \in \mathcal{S}.$$

Since $\tau = \tau^{-1}$ we can thus compute possible values of $\tau$ using our previous method for solving this
conjugation problem. This might not determine the whole plugboard but it will determine enough
for other methods to be used.

Knowledge of $\epsilon_j \cdot \epsilon_{j+3}$ for Some $j$s: A similar method to the previous one applies in this case
as well, since, if we know $\epsilon_j \cdot \epsilon_{j+3}$ for all $j \in \mathcal{S}$ and we know $\gamma_j$, we can use the equation

$$(\epsilon_j \cdot \epsilon_{j+3}) = \tau \cdot (\gamma_j \cdot \gamma_{j+3}) \cdot \tau \quad \text{for } j \in \mathcal{S}.$$

The utility of a method upon knowing $\epsilon_j \cdot \epsilon_{j+3}$ will become apparent in a little while.

8.4.  Double  Encryption  of  Message  Keys 

The Polish mathematicians Jerzy Rozycki, Henryk Zygalski and Marian Rejewski were the first to
find ways of analysing the Enigma machine. To understand their methods one must first understand
how the Germans used the machine. On each day the machine was set up with a key, as above,
which was chosen by looking in a code book; each subnet would have a different day key.

To encipher a message the sending operator decided on a message key. The message key would
be a sequence of three letters, say DHI, which would need to be transported to the recipient. Using
the day key, the message key would be enciphered twice. The double enciphering was to act as a
form of error control. Hence, DHI might be enciphered as XHJKLM. Note that D encrypts first
to X and then to K.

The receiver would obtain XHJKLM and then decrypt this to obtain DHI. Both operators
would then move the wheels around to the positions D, H and I, i.e. they would turn the wheels
so that D was in the leftmost window, H in the middle one and I in the rightmost window. Then
the actual message would be enciphered. For this example, in our notation, this would mean that
the message key was equal to the day key, except that $p_1 = 8$, i.e. I, $p_2 = 7$, i.e. H, and $p_3 = 3$, i.e.
D.

Suppose  we  intercept  a  set  of  messages  which  have  the  following  headers,  consisting  of  the 
encryption  of  the  three-letter  rotor  positions,  followed  by  its  encryption  again,  i.e.  the  first  six 
letters  of  each  message  are  equal  to 


UCWBLR  ZSETEY  SLVMQH  SGIMVW  PMRWGV
VNGCTP  OQDPNS  CBRVPV  KSCJEA  GSTGEU
DQLSNL  HXYYHF  GETGSU  EEKLSJ  OSQPEB
WISIIT  TXFEHX  ZAMTAM  VEMCSM  LQPFNI
LOIFMW  JXHUHZ  PYXWFQ  FAYQAF  QJPOUI
EPILWW  DOGSMP  ADSDRT  XLJXQK  BKEAKY
DDESRY  QJCOUA  JEZUSN  MUXROQ  SLPMQI
RRONYG  ZMOTGG  XUOXOG  HIUYIE  KCPJLI
DSESEY  OSPPEI  QCPOLI  HUXYOQ  NYIKFW

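Let us take the last one of these and look at it in more detail. We know that there are three
underlying secret letters, say $l_1$, $l_2$ and $l_3$. We also know that

$$l_1^{\epsilon_0} = N, \quad l_2^{\epsilon_1} = Y, \quad l_3^{\epsilon_2} = I,$$

and

$$l_1^{\epsilon_3} = K, \quad l_2^{\epsilon_4} = F, \quad l_3^{\epsilon_5} = W.$$

Hence, given that $\epsilon_j^{-1} = \epsilon_j$, we have

$$N^{\epsilon_0 \epsilon_3} = l_1^{\epsilon_0 \epsilon_0 \epsilon_3} = l_1^{\epsilon_3} = K, \quad Y^{\epsilon_1 \epsilon_4} = F, \quad I^{\epsilon_2 \epsilon_5} = W.$$

Continuing in this way we can compute a permutation representation of the three products as
follows:

$$\begin{aligned}
\epsilon_0 \cdot \epsilon_3 &= (ADSMRNKJUB)(CV)(ELFQOPWIZT)(HY), \\
\epsilon_1 \cdot \epsilon_4 &= (BPWJUOMGV)(CLQNTDRYF)(ES)(HX), \\
\epsilon_2 \cdot \epsilon_5 &= (AC)(BDSTUEYFXQ)(GPIWRVHZNO)(JK).
\end{aligned}$$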

8.5.  Determining  the  Internal  Rotor  Wirings 

However, life was even more difficult for the Polish analysts as they did not even know the rotor
wirings or the reflector values. Hence, they needed to break the scheme without even having a
description of the actual machine. They did at least have access to a non-military version of
Enigma and deduced the basic structure. In this they had two bits of luck:

(1) They deduced that the wiring between the plugboard and the rightmost rotor was in
alphabetical order. Had this not been the case they would have needed to find an additional,
hidden permutation.

(2) Secondly, the French cryptographer Gustave Bertrand obtained from a German spy,
Hans-Thilo Schmidt, two months’ worth of day keys. Thus, for two months of traffic the Poles
had access to the day settings.

From this information they needed to deduce the internal wirings of the Enigma machine.

Note that in the pre-war days the Germans only used three wheels out of a choice of three,
hence the number of day keys is actually reduced by a factor of ten. This is, however, only a slight
simplification (at least with modern technology). To explain the method suppose we are given that
the day setting is

Rotors       Rings   Pos   Plugboard
III, II, I   TXC     EAZ   (AM)(TE)(BC)

We do not know what the actual rotors are at present, but we know that the one labelled rotor I
will be placed in the rightmost slot (our label one). So we have

$$r_1 = 2, \quad r_2 = 23, \quad r_3 = 19, \quad p_1 = 25, \quad p_2 = 0, \quad p_3 = 4.$$

Suppose also that the data from the previous section was obtained as traffic for that day. Hence,
we obtain the following three values for the products $\epsilon_j \cdot \epsilon_{j+3}$,

$$\begin{aligned}
\epsilon_0 \cdot \epsilon_3 &= (ADSMRNKJUB)(CV)(ELFQOPWIZT)(HY), \\
\epsilon_1 \cdot \epsilon_4 &= (BPWJUOMGV)(CLQNTDRYF)(ES)(HX), \\
\epsilon_2 \cdot \epsilon_5 &= (AC)(BDSTUEYFXQ)(GPIWRVHZNO)(JK).
\end{aligned}$$

From these we wish to deduce the values of $\epsilon_0, \epsilon_1, \ldots, \epsilon_5$. We will use the fact that $\epsilon_j$ is a product
of disjoint transpositions and Theorem 8.2.

We take the first product and look at it in more detail. We take the sets of two cycles of equal
degree and write them above one another, with the bottom one reversed in order, i.e.

ADSMRNKJUB  CV
TZIWPOQFLE  YH

We now run through all possible shifts of the bottom rows. Each shift gives us a possible value of
$\epsilon_0$ and $\epsilon_3$. The value of $\epsilon_0$ is obtained from reading off the disjoint transpositions from the columns,
the value of $\epsilon_3$ is obtained by reading off the transpositions from the “off diagonals”. For example
with the above orientation we would have

$$\begin{aligned}
\epsilon_0 &= (AT)(DZ)(SI)(MW)(RP)(NO)(KQ)(JF)(UL)(BE)(CY)(VH), \\
\epsilon_3 &= (DT)(SZ)(MI)(RW)(NP)(KO)(JQ)(UF)(BL)(AE)(VY)(CH).
\end{aligned}$$

This still leaves us, in this case, with $20 = 2 \cdot 10$ possible values for $\epsilon_0$ and $\epsilon_3$.

Now, to reduce this number we need to rely on operational errors by German operators. Various
operators had a tendency to always select the same three letter message key. For example, a popular
choice was QWE (the first letters on the keyboard). One operator used the letters of his
girlfriend’s name, Cillie, hence such “cribs” (or guessed/known plaintexts in today’s jargon) became
known as “cillies”. Note, for our analysis here we only need one cillie for the day when we wish to
obtain the internal wiring of rotor I.

In our dummy example, suppose we guess (correctly) that the first message key is indeed QWE,
and that UCWBLR is the encryption of QWE twice. This in turn tells us how to align our cycle
of length 10 in the first permutation, as under $\epsilon_0$ the letter Q must encrypt to U.

ADSMRNKJUB
LETZIWPOQF

We can check that this is consistent as we see that Q under $\epsilon_3$ must then encrypt to B. Assuming
we carry on in this way we will finally deduce that

$$\begin{aligned}
\epsilon_0 &= (AL)(BF)(CH)(DE)(GX)(IR)(JO)(KP)(MZ)(NW)(QU)(ST)(VY), \\
\epsilon_1 &= (AK)(BQ)(CW)(DM)(EH)(FJ)(GT)(IZ)(LP)(NV)(OR)(SX)(UY), \\
\epsilon_2 &= (AJ)(BN)(CK)(DZ)(EW)(FP)(GX)(HS)(IY)(LM)(OQ)(RU)(TV), \\
\epsilon_3 &= (AF)(BQ)(CY)(DL)(ES)(GX)(HV)(IN)(JP)(KW)(MT)(OU)(RZ), \\
\epsilon_4 &= (AK)(BN)(CJ)(DG)(EX)(FU)(HS)(IZ)(LW)(MR)(OY)(PQ)(TV), \\
\epsilon_5 &= (AK)(BO)(CJ)(DN)(ER)(FI)(GQ)(HT)(LM)(PX)(SZ)(UV)(WY).
\end{aligned}$$
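The alignment procedure for a pair of equal-length cycles can be written down directly. The Python sketch below enumerates every shift of the reversed second cycle under the first, reading the first permutation off the columns and the second off the off-diagonals, and then picks the alignment fixed by the crib (Q encrypting to U in the first position). The function name `alignments` is illustrative, and only the two 10-cycles of the first product are treated here.

```python
def alignments(cycle_a, cycle_b):
    """Yield (first, second) partial involutions, one pair per shift, such that
    applying `first` then `second` reproduces the two given cycles."""
    n = len(cycle_a)
    if len(cycle_b) != n:
        raise ValueError("cycles must have equal length")
    rev = cycle_b[::-1]
    for s in range(n):
        bottom = rev[s:] + rev[:s]
        first, second = {}, {}
        for i in range(n):
            top, bot, top_next = cycle_a[i], bottom[i], cycle_a[(i + 1) % n]
            first[top], first[bot] = bot, top              # columns
            second[bot], second[top_next] = top_next, bot  # off-diagonals
        yield first, second

candidates = list(alignments("ADSMRNKJUB", "ELFQOPWIZT"))
print(len(candidates))  # 10

# The crib says Q encrypts to U under the first permutation.
first, second = next(c for c in candidates if c[0]["Q"] == "U")
print(first["A"], second["Q"])  # L B
```

The selected alignment gives A ↔ L in the first permutation and Q ↔ B in the fourth, matching the values listed above; the same routine applied to the 2-cycles and to the other two products would fill in the remaining transpositions.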

We now need to use this information to deduce the value of $\rho_1$, etc. So for the rest of this section
we assume that we know the $\epsilon_j$ for $j = 0, \ldots, 5$. Recall that we have

$$\begin{aligned}
\epsilon_j = {} & \tau \cdot (\sigma^{i_1+j} \cdot \rho_1 \cdot \sigma^{-i_1-j}) \cdot (\sigma^{i_2} \cdot \rho_2 \cdot \sigma^{-i_2}) \cdot (\sigma^{i_3} \cdot \rho_3 \cdot \sigma^{-i_3}) \cdot \varrho \\
& \cdot (\sigma^{i_3} \cdot \rho_3^{-1} \cdot \sigma^{-i_3}) \cdot (\sigma^{i_2} \cdot \rho_2^{-1} \cdot \sigma^{-i_2}) \cdot (\sigma^{i_1+j} \cdot \rho_1^{-1} \cdot \sigma^{-i_1-j}) \cdot \tau.
\end{aligned}$$

We now assume that no stepping of the second rotor occurs during the first six encryptions under the
day setting. This holds with quite high probability, namely 20/26 ≈ 0.77. Should the assumption
turn out to be false we will notice in our later analysis and it will mean that we can deduce
something about the (unknown to us at this point) position of the notch on the first rotor.

Given that we know the day settings, including $\tau$ and the values of $i_1$, $i_2$ and $i_3$ (since we are
assuming $k_1 = k_2 = 0$ for $0 \le j \le 5$), we can write the above equation for $0 \le j \le 5$ as

$$\begin{aligned}
\lambda_j &= \sigma^{-i_1-j} \cdot \tau \cdot \epsilon_j \cdot \tau \cdot \sigma^{i_1+j} \\
&= \rho_1 \cdot \sigma^{-j} \cdot \gamma \cdot \sigma^{j} \cdot \rho_1^{-1},
\end{aligned}$$

where $\lambda_j$ is now known and we wish to determine $\rho_1$ for some fixed but unknown value of $\gamma$. The
permutation $\gamma$ is in fact equal to

$$\gamma = (\sigma^{i_2-i_1} \cdot \rho_2 \cdot \sigma^{-i_2}) \cdot (\sigma^{i_3} \cdot \rho_3 \cdot \sigma^{-i_3}) \cdot \varrho \cdot (\sigma^{i_3} \cdot \rho_3^{-1} \cdot \sigma^{-i_3}) \cdot (\sigma^{i_2} \cdot \rho_2^{-1} \cdot \sigma^{i_1-i_2}).$$

In our example we get the following values for $\lambda_j$:

λ_0 = (AD)(BR)(CQ)(EV)(FZ)(GP)(HM)(IN)(JK)(LU)(OS)(TW)(XY),
λ_1 = (AV)(BP)(CZ)(DF)(EI)(GS)(HY)(JL)(KO)(MU)(NQ)(RW)(TX),
λ_2 = (AL)(BK)(CN)(DZ)(EV)(FP)(GX)(HS)(IY)(JM)(OQ)(RU)(TW),
λ_3 = (AS)(BF)(CZ)(DR)(EM)(GN)(HY)(IW)(JO)(KQ)(LX)(PV)(TU),
λ_4 = (AQ)(BK)(CT)(DL)(EP)(FI)(GX)(HW)(JU)(MO)(NY)(RS)(VZ),
λ_5 = (AS)(BZ)(CV)(DO)(EM)(FR)(GQ)(HK)(IL)(JT)(NP)(UW)(XY).

We now form, for $j = 0, \ldots, 4$,

$$\begin{aligned}
\mu_j &= \lambda_j \cdot \lambda_{j+1} \\
&= \rho_1 \cdot \sigma^{-j} \cdot \gamma \cdot \sigma^{-1} \cdot \gamma \cdot \sigma^{j+1} \cdot \rho_1^{-1} \\
&= \rho_1 \cdot \sigma^{-j} \cdot \delta \cdot \sigma^{j} \cdot \rho_1^{-1},
\end{aligned}$$

where $\delta = \gamma \cdot \sigma^{-1} \cdot \gamma \cdot \sigma$ is unknown. Eliminating $\delta$ via
$\delta = \sigma^{j-1} \cdot \rho_1^{-1} \cdot \mu_{j-1} \cdot \rho_1 \cdot \sigma^{-j+1}$ we find the
following equations for $j = 1, \ldots, 4$,

$$\begin{aligned}
\mu_j &= (\rho_1 \cdot \sigma^{-1} \cdot \rho_1^{-1}) \cdot \mu_{j-1} \cdot (\rho_1 \cdot \sigma \cdot \rho_1^{-1}) \\
&= \alpha \cdot \mu_{j-1} \cdot \alpha^{-1},
\end{aligned}$$

where $\alpha = \rho_1 \cdot \sigma^{-1} \cdot \rho_1^{-1}$. Hence, $\mu_j$ and $\mu_{j-1}$ are conjugate and so by Theorem 8.1 have the same
cycle structure. For our example we have

μ_0 = (AFCNE)(BWXHUJOG)(DVIQZ)(KLMYTRPS),
μ_1 = (AEYSXWUJ)(BFZNO)(CDPKQ)(GHIVLMRT),
μ_2 = (AXNZRTIH)(BQJEP)(CGLSYWUD)(FVMOK),
μ_3 = (ARLGYWFK)(BIHNXDSQ)(CVEOU)(JMPZT),
μ_4 = (AGYPMDIR)(BHUTV)(CJWKZ)(ENXQSFLO).


At this point we can check whether our assumption of no-stepping, i.e. a constant value for the
values of $i_2$ and $i_3$, is valid. If a step did occur in the second rotor then the above permutations
would be unlikely to have the same cycle structure.
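The passage from the per-letter encryptions to the conjugated permutations is a purely mechanical computation, sketched below in Python for the first two letters only. It assumes the day setting of this section, reading the plugboard entry as the three plugs A ↔ M, T ↔ E and B ↔ C and taking the first rotor offset as 25 − 2 + 1 = 24; the per-letter permutations are the first two deduced in the previous paragraphs, and the helper names are illustrative.

```python
import string

ALPHABET = string.ascii_uppercase

def involution(text):
    """Disjoint transpositions -> list p with p[x] the image of letter number x."""
    p = list(range(26))
    for a, b in text.strip("()").split(")("):
        x, y = ALPHABET.index(a), ALPHABET.index(b)
        p[x], p[y] = y, x
    return p

TAU = involution("(AM)(TE)(BC)")
EPS = [involution("(AL)(BF)(CH)(DE)(GX)(IR)(JO)(KP)(MZ)(NW)(QU)(ST)(VY)"),
       involution("(AK)(BQ)(CW)(DM)(EH)(FJ)(GT)(IZ)(LP)(NV)(OR)(SX)(UY)")]
I1 = 24  # p_1 - r_1 + 1 with p_1 = 25 and r_1 = 2

def lam(j):
    """sigma^-(i1+j) * tau * eps_j * tau * sigma^(i1+j), acting left to right."""
    s = I1 + j
    return [(TAU[EPS[j][TAU[(x - s) % 26]]] + s) % 26 for x in range(26)]

def to_cycles(p):
    seen, out = set(), []
    for start in range(26):
        if start in seen or p[start] == start:
            continue
        cycle, x = "", start
        while x not in seen:
            seen.add(x)
            cycle += ALPHABET[x]
            x = p[x]
        out.append("(" + cycle + ")")
    return "".join(out)

lam0, lam1 = lam(0), lam(1)
mu0 = [lam1[lam0[x]] for x in range(26)]   # lam0 first, then lam1
print(to_cycles(mu0))  # (AFCNE)(BWXHUJOG)(DVIQZ)(KLMYTRPS)
```

Repeating this for the remaining letters and comparing the cycle lengths of the resulting products is exactly the consistency check described above: a mismatch in cycle structure signals that the second rotor stepped somewhere in the first six letters.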

We need to determine the structure of the permutation $\alpha$; this is done by looking at the four
equations simultaneously. We note that $\alpha$ and $\sigma^{-1}$ are conjugates, under $\rho_1$, and we know that $\alpha$ has
cycle structure of a single cycle of length 26, since $\sigma^{-1}$, the inverse of the shift-up permutation,
is a single cycle of length 26. In our example we only find one possible solution for $\alpha$, namely

$$\alpha = (AGYWUJOQNIRLSXHTMKCEBZVPFD).$$

To solve for $\rho_1$ we need to find a permutation such that

$$\alpha = \rho_1 \cdot \sigma^{-1} \cdot \rho_1^{-1}.$$

We find there are 26 such solutions

(AELTPHQXRU)(BKNW)(CMOY)(DFG)(IV)(JZ)
(AFHRVJ)(BLU)(CNXSTQYDGEMPIW)(KOZ)
(AGFIXTRWDHSUCO)(BMQZLVKPJ)(ENY)
(AHTSVLWEOBNZMRXUDIYFJCPKQ)
(AIZN)(BOCQ)(DJ)(EPLXVMSWFKRYGHU)
(AJEQCRZODKSXWGI)(BPMTUFLYHVN)
(AKTVOER)(BQDLZPNCSYI)(FMUGJ)(HW)
(AL)(BR)(CTWI)(DMVPOFN)(ESZQ)(GKUHXYJ)
(AMWJHYKVQFOGLBS)(CUIDNETXZR)
(ANFPQGMX)(BTYLCVRDOHZS)(EUJI)(KW)
(AOIFQH)(BUKX)(CWLDPREVS)(GN)(MY)(TZ)
(APSDQIGOJKYNHBVT)(CX)(EWMZUL)(FR)
(AQJLFSEXDRGPTBWNIHCYOKZVUM)
(ARHDSFTCZWOLGQK)(BXEYPUNJM)
(ASGRIJNKBYQLHEZXFUOMC)(DT)(PVW)
(ATE)(BZYRJONLIKC)(DUPWQM)(FVXGSH)
(AUQNMEB)(DVYSILJPXHGTFWRK)
(AVZ)(CDWSJQOPYTGURLKE)(FXIM)
(AWTHINOQPZBCEDXJRMGV)(FYUSK)
(AXKGWUTIORNP)(BDYV)(CFZ)(HJSLM)
(AYWVCGXLNQROSMIPBEF)(DZ)(HK)(JT)
(AZEGYXMJUVD)(BF)(CHLOTKIQSNRP)
(BGZFCIRQTLPD)(EHMKJV)(NSOUWX)
(ABHNTMLQUXOVFDCJWYZG)(EISP)
(ACKLRSQVGBITNUY)(EJXPF)(HOWZ)
(ADEKMNVHPGCLSRTOXQW)(BJY)(IUZ)

These are the values of $\rho_1 \cdot \sigma^i$, for $i = 0, \ldots, 25$. So with one day's messages we
can determine the value of $\rho_1$ up to multiplication by a power of $\sigma$. The Polish had access to
two months' such data and so were able to determine similar sets for $\rho_2$ and $\rho_3$ (as different
rotor orders are used on different days). Note that, at this point, the Germans did not use a
selection of three from five rotors.
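
The 26 candidates can be regenerated mechanically. The following is a minimal sketch, not the
procedure used historically: it assumes that permutations act on the right (so a product is applied
left to right) and that the shift-left permutation sends each letter to its predecessor, which is the
direction that reproduces the list above. The helper names `candidate`, `is_solution` and `cycles`
are ours.

```python
import string

ALPHABET = string.ascii_uppercase
N = 26

# alpha as recovered in the text, written as a single 26-cycle.
ALPHA_CYCLE = "AGYWUJOQNIRLSXHTMKCEBZVPFD"


def sigma(x):
    """Shift permutation; the direction x -> x - 1 is an assumption."""
    return (x - 1) % N


def candidate(i):
    """Return the i-th candidate for rho_1 as a list, rho[x] = image of x."""
    rho = [0] * N
    for k, letter in enumerate(ALPHA_CYCLE):
        # The k-th letter of the alpha cycle must land on the k-th
        # element of the sigma cycle started at letter number i.
        rho[ALPHABET.index(letter)] = (i - k) % N
    return rho


def is_solution(rho):
    """Check alpha = rho . sigma . rho^{-1}, composing left to right."""
    inv = [0] * N
    for x, y in enumerate(rho):
        inv[y] = x
    shifted = ALPHA_CYCLE[1:] + ALPHA_CYCLE[0]
    alpha = {ALPHABET.index(a): ALPHABET.index(b)
             for a, b in zip(ALPHA_CYCLE, shifted)}
    return all(inv[sigma(rho[x])] == alpha[x] for x in range(N))


def cycles(perm):
    """Disjoint-cycle notation, omitting fixed points."""
    seen, out = set(), []
    for start in range(N):
        if start in seen or perm[start] == start:
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(ALPHABET[x])
            x = perm[x]
        out.append("(" + "".join(cyc) + ")")
    return "".join(out)


solutions = [candidate(i) for i in range(N)]
for rho in solutions:
    print(cycles(rho))
```

Each printed line is one element of the coset $\rho_1 \cdot \langle\sigma\rangle$; nothing in the day's
data distinguishes between them, which is why the remaining ambiguity has to be resolved later.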

If we select three representatives $\hat\rho_1$, $\hat\rho_2$ and $\hat\rho_3$ from the sets of possible
rotors, then we have

$$\hat\rho_1 = \rho_1 \cdot \sigma^{l_1},$$
$$\hat\rho_2 = \rho_2 \cdot \sigma^{l_2},$$
$$\hat\rho_3 = \rho_3 \cdot \sigma^{l_3}.$$

However, we still do not know the value for the reflector $\varrho$, or the correct values of $l_1$,
$l_2$ and $l_3$. To understand how to proceed we present the following theorem.

Theorem 8.3. Consider an Enigma machine $\mathcal{E}$ that uses rotors $\rho_1$, $\rho_2$ and $\rho_3$,
and reflector $\varrho$. Then there is an Enigma machine $\hat{\mathcal{E}}$ using rotors $\hat\rho_1$,
$\hat\rho_2$ and $\hat\rho_3$, and a different reflector $\hat\varrho$ such that, for every setting of
$\mathcal{E}$, there is a setting of $\hat{\mathcal{E}}$ such that the machines have identical behaviour.
Furthermore, $\hat{\mathcal{E}}$ can be constructed so that the machines use identical day settings
except for the ring positions.

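Proof. The following proof was shown to me by Eugene Luks; I thank him for allowing me to
reproduce it here. The first claim is that $\hat\varrho$ is determined via

$$\hat\varrho = \sigma^{-(l_1+l_2+l_3)} \cdot \varrho \cdot \sigma^{(l_1+l_2+l_3)}.$$

We can see this by the following argument (and the fact that the reflector is uniquely determined
by the above equation). Define the following function

$$P(\rho_1, \rho_2, \rho_3, \varrho, t_1, t_2, t_3) = \tau \cdot (\sigma^{t_1} \cdot \rho_1 \cdot \sigma^{-t_1}) \cdot (\sigma^{t_2} \cdot \rho_2 \cdot \sigma^{-t_2}) \cdot (\sigma^{t_3} \cdot \rho_3 \cdot \sigma^{-t_3}) \cdot \varrho$$
$$\qquad \cdot (\sigma^{t_3} \cdot \rho_3^{-1} \cdot \sigma^{-t_3}) \cdot (\sigma^{t_2} \cdot \rho_2^{-1} \cdot \sigma^{-t_2}) \cdot (\sigma^{t_1} \cdot \rho_1^{-1} \cdot \sigma^{-t_1}) \cdot \tau.$$

We then have the relation,

$$P(\hat\rho_1, \hat\rho_2, \hat\rho_3, \hat\varrho, t_1, t_2, t_3) = P(\rho_1, \rho_2, \rho_3, \varrho, t_1, t_2 + l_1, t_3 + l_1 + l_2).$$

Recall the following expressions for the functions which control the stepping of the three rotors:

$$k_1 = \lfloor (j - m_1 + 26)/26 \rfloor,$$
$$k_2 = \lfloor (j - m_2 + 650)/650 \rfloor,$$
$$i_1 = p_1 - r_1 + 1,$$
$$i_2 = p_2 - r_2 + k_1 + k_2,$$
$$i_3 = p_3 - r_3 + k_2.$$

The Enigma machine $\mathcal{E}$ is given by the equation

$$\epsilon_j = P(\rho_1, \rho_2, \rho_3, \varrho, i_1 + j, i_2, i_3),$$

where we interpret $i_2$ and $i_3$ as functions of $j$ as above. We now set the ring positions in
$\hat{\mathcal{E}}$ to be given by

$$r_1, \quad r_2 + l_1, \quad r_3 + l_1 + l_2,$$

in which case we have that the output of this Enigma machine is given by

$$\hat\epsilon_j = P(\hat\rho_1, \hat\rho_2, \hat\rho_3, \hat\varrho, i_1 + j, i_2 - l_1, i_3 - l_1 - l_2).$$

But then we conclude that $\epsilon_j = \hat\epsilon_j$. □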
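
The key relation in the proof can be checked numerically. The sketch below is an illustration under
stated assumptions rather than part of the proof: rotors, reflector and plugboard are drawn at random
(they are not historical wirings), permutations are composed left to right, and the shift $\sigma$ is
taken to send each letter to its predecessor. The function names are ours.

```python
import random

N = 26


def compose(*perms):
    """Left-to-right product: x is sent through perms[0] first."""
    out = list(range(N))
    for p in perms:
        out = [p[y] for y in out]
    return out


def inverse(p):
    inv = [0] * N
    for x, y in enumerate(p):
        inv[y] = x
    return inv


def sigma_pow(k):
    """k-th power of the shift; the direction x -> x - 1 is an assumption."""
    return [(x - k) % N for x in range(N)]


def rotor_at(rho, t):
    """The conjugated rotor sigma^t . rho . sigma^-t."""
    return compose(sigma_pow(t), rho, sigma_pow(-t))


def P(rotors, reflector, plugboard, ts):
    """The function P of the proof, for rotor offsets ts = (t1, t2, t3)."""
    forward = compose(*(rotor_at(r, t) for r, t in zip(rotors, ts)))
    return compose(plugboard, forward, reflector, inverse(forward), plugboard)


def random_involution(rng, pairs):
    """Product of `pairs` disjoint transpositions chosen at random."""
    letters = list(range(N))
    rng.shuffle(letters)
    p = list(range(N))
    for a, b in zip(letters[0:2 * pairs:2], letters[1:2 * pairs:2]):
        p[a], p[b] = b, a
    return p


def check_theorem(seed):
    rng = random.Random(seed)
    rotors = [rng.sample(range(N), N) for _ in range(3)]
    reflector = random_involution(rng, 13)
    plugboard = random_involution(rng, 6)
    l1, l2, l3 = (rng.randrange(N) for _ in range(3))
    total = l1 + l2 + l3
    # hat rho_i = rho_i . sigma^{l_i}
    hat_rotors = [compose(r, sigma_pow(l))
                  for r, l in zip(rotors, (l1, l2, l3))]
    # hat varrho = sigma^{-(l1+l2+l3)} . varrho . sigma^{(l1+l2+l3)}
    hat_reflector = compose(sigma_pow(-total), reflector, sigma_pow(total))
    t1, t2, t3 = (rng.randrange(N) for _ in range(3))
    lhs = P(hat_rotors, hat_reflector, plugboard, (t1, t2, t3))
    rhs = P(rotors, reflector, plugboard, (t1, t2 + l1, t3 + l1 + l2))
    return lhs == rhs


print(all(check_theorem(seed) for seed in range(20)))
```

The check holds for any choice of permutations, since it only uses the fact that powers of $\sigma$
commute with each other.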

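We now use this result to fully determine $\mathcal{E}$ from the available data. We pick values of
$\hat\rho_1$, $\hat\rho_2$ and $\hat\rho_3$ and determine a possible reflector by solving for
$\hat\varrho$ in

$$\epsilon_0 = \tau \cdot (\sigma^{i_1} \cdot \hat\rho_1 \cdot \sigma^{-i_1}) \cdot (\sigma^{i_2} \cdot \hat\rho_2 \cdot \sigma^{-i_2}) \cdot (\sigma^{i_3} \cdot \hat\rho_3 \cdot \sigma^{-i_3}) \cdot \hat\varrho$$
$$\qquad \cdot (\sigma^{i_3} \cdot \hat\rho_3^{-1} \cdot \sigma^{-i_3}) \cdot (\sigma^{i_2} \cdot \hat\rho_2^{-1} \cdot \sigma^{-i_2}) \cdot (\sigma^{i_1} \cdot \hat\rho_1^{-1} \cdot \sigma^{-i_1}) \cdot \tau.$$

We let $\hat{\mathcal{E}}^1$ denote the Enigma machine with rotors given by $\hat\rho_1$, $\hat\rho_2$,
$\hat\rho_3$ and reflector $\hat\varrho$, but with ring settings the same as in the target machine
$\mathcal{E}$ (we know the ring settings of $\mathcal{E}$ since we have the day key). Note that
$\hat{\mathcal{E}}^1 \ne \hat{\mathcal{E}}$ from the above proof, since the rings are in the same place
as in the target machine.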

Assume we have obtained a long message, with a given message key. We put the machine
$\hat{\mathcal{E}}^1$ in the message key configuration and start to decrypt the message. This will work
(i.e. produce a valid decryption) up to the point when the sequence of permutations
$\hat\epsilon^1_j$ produced by $\hat{\mathcal{E}}^1$ differs from the sequence $\epsilon_j$ produced by
$\mathcal{E}$.

At this point we cycle through all values of $l_1$ and fix the first permutation (and also the
associated reflector) to obtain a new Enigma machine $\hat{\mathcal{E}}^2$ which allows us to decrypt
more of the long message. If a long enough message is obtained we can also obtain $l_2$ in this way,
or alternatively wait for another day when the rotor order is changed. Thus the entire internal
workings of the Enigma machine can be determined.


8.6.  Determining  the  Day  Settings 

Having now determined the internal wirings, given the set of two months of day settings obtained
by Bertrand, the next task is to determine the actual key when the day settings are not available.
At this stage we assume that the German operators are still using the “encrypt the message setting
twice” routine. The essential trick here is to notice that if we write the cipher as

$$\epsilon_j = \tau \cdot \gamma_j \cdot \tau,$$

then, since $\tau \cdot \tau$ is the identity,

$$\epsilon_j \cdot \epsilon_{j+3} = \tau \cdot \gamma_j \cdot \gamma_{j+3} \cdot \tau.$$

So $\epsilon_j \cdot \epsilon_{j+3}$ is conjugate to $\gamma_j \cdot \gamma_{j+3}$ and so by Theorem 8.1
they have the same cycle structure. More importantly the cycle structure does not depend on the
plugboard $\tau$.
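
This independence from the plugboard is easy to observe experimentally. The following sketch uses
random involutions as stand-ins for $\gamma_j$, $\gamma_{j+3}$ and $\tau$ (an assumption made purely
for illustration; no real rotor wirings are involved) and compares cycle types.

```python
import random

N = 26


def compose(*perms):
    """Left-to-right product of permutations given as lists."""
    out = list(range(N))
    for p in perms:
        out = [p[y] for y in out]
    return out


def cycle_type(p):
    """Sorted list of cycle lengths, fixed points included."""
    seen, lengths = set(), []
    for start in range(N):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = p[x]
            length += 1
        lengths.append(length)
    return sorted(lengths)


def random_involution(rng, pairs):
    """Product of `pairs` disjoint transpositions chosen at random."""
    letters = list(range(N))
    rng.shuffle(letters)
    p = list(range(N))
    for a, b in zip(letters[0:2 * pairs:2], letters[1:2 * pairs:2]):
        p[a], p[b] = b, a
    return p


rng = random.Random(1)
gamma_j = random_involution(rng, 13)
gamma_j3 = random_involution(rng, 13)
reference = cycle_type(compose(gamma_j, gamma_j3))

observed = set()
for _ in range(100):
    tau = random_involution(rng, rng.randrange(0, 14))
    eps_j = compose(tau, gamma_j, tau)
    eps_j3 = compose(tau, gamma_j3, tau)
    observed.add(tuple(cycle_type(compose(eps_j, eps_j3))))

print(reference, observed == {tuple(reference)})
```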

Hence, if we can use the cycle structure to determine the rotor settings then all that remains is to
determine the plugboard settings. From the rotor settings we know the values of $\gamma_j$, for
$j = 1, \ldots, 6$; from the encrypted message keys we can compute $\epsilon_j$ for $j = 1, \ldots, 6$ as
in the previous section. Hence, the plugboard settings can be recovered by solving another of our
conjugacy problems, for $\tau$. This is easier than before as we have that $\tau$ must be a product of
disjoint transpositions.

We have already discussed how to compute $\epsilon_j \cdot \epsilon_{j+3}$ from the encryption of the
message keys. Hence, we simply compute these values and compare their cycle structures with those
obtained by running through all possible

$$60 \cdot 26^3 \cdot 26^3 = 18\,534\,946\,560$$

choices for the rotors, positions and ring settings! Note that when this was done by the Polish
analysts in the 1930s there was only a choice of the ordering of three rotors. The extra choice of
rotors did not come in till a bit later. Hence, the total choice was 10 times less than this figure.

The above simplifies further if we assume that no stepping of the second and third rotor occurs
during the calculation of the first six ciphertext characters. Recall this happens around
seventy-seven percent of the time. In such a situation the cycle structure depends only on the rotor
order and the difference $p_i - r_i$ between the starting rotor position and the ring setting. Hence,
we might as well assume that $r_1 = r_2 = r_3 = 0$ when computing all of the cycle structures. So, for
seventy-seven percent of days our search amongst the cycle structures is then only among

$$60 \cdot 26^3 = 1054560 \quad (\text{resp. } 105456)$$

possible cycle structures.
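
The counts quoted above can be reproduced with a few lines of arithmetic. The figure of
seventy-seven percent is obtained here under the assumption that the middle rotor steps at exactly
one of the 26 positions of the fast rotor, so that six of the 26 starting positions produce a step
within the first six keypresses.

```python
from fractions import Fraction

# Ordered choices of three rotors out of three, and out of five.
rotor_orders = {3: 3 * 2 * 1, 5: 5 * 4 * 3}
positions = 26 ** 3          # rotor positions, or ring settings

# Rotor order, rotor positions and ring settings all unknown.
full_search = {n: orders * positions * positions
               for n, orders in rotor_orders.items()}

# Rings fixed to zero: only the differences p_i - r_i matter.
reduced_search = {n: orders * positions
                  for n, orders in rotor_orders.items()}

# Probability that the middle rotor does not step in six keypresses.
no_step = Fraction(26 - 6, 26)

print(full_search[5], reduced_search[5], reduced_search[3], float(no_step))
```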

After the above procedure we have determined all values of the initial day setting bar $p_i$ and
$r_i$; however, we know the differences $p_i - r_i$. We also know for any given message the message key
$p'_1, p'_2, p'_3$. Hence, in breaking the actual message we only require the solution for $r_1, r_2$;
the value for $r_3$ is irrelevant as the third rotor never moves a fourth rotor. Most German messages
started with the same two-letter word followed by a space (space was encoded by ‘X’). Hence, we only
need to go through $26^2$ different positions to get the correct ring setting. Actually one goes
through $26^2$ wheel positions with a fixed ring, and uses the differences to infer the true ring
settings.

Once $r_i$ is determined from one message the value of $p_i$ can be determined for the day key
and then all messages can be trivially broken. Another variant here, if a suitable piece of known
plaintext can be deduced, is to apply the technique from Section 8.3.2 with the obvious modification
to deduce the ring settings as well.

8.7. The Germans Make It Harder


In September 1938 the German operators altered the way that day and message keys were used.
Now a day key consisted of a rotor order, the ring settings and the plugboard. But the rotor
positions were not part of the day key. A cipher operator would now choose their own initial rotor
positions, say AXE, and their own message rotor positions, say GPI. The operator would put their
machine in the AXE setting and then encrypt GPI twice as before, to obtain say POWKNP.
The rotors would then be placed in the GPI position and the message would be encrypted. The
message header would be AXE POWKNP.

This procedure makes the analysis of the previous section useless, as each message would now
have its own “day” rotor position setting, and so one could not collect data from many messages
so as to recover $\epsilon_0 \cdot \epsilon_3$ etc. as in the previous section.

What was needed was a new way of characterizing the rotor positions. The strategy invented
by Zygalski was to use so-called “females”. In the six letters of the enciphered message key a female
is the occurrence of the same letter in the same position in each substring of three. For example,
the header POWKNP contains no females, but the header POWPNL contains one female in
position zero, i.e. the repeated values of P, separated by three positions.

To consider what is implied by the existence of such females, firstly suppose we receive POWPNL
as above and that the unknown first key setting is $x$. Then we have that, if $\epsilon_i$ represents
the Enigma machine in the day setting,

$$x^{\epsilon_0} = x^{\epsilon_3} = P,$$

that is, since $\epsilon_0$ is its own inverse,

$$P^{\epsilon_0 \cdot \epsilon_3} = x^{\epsilon_0 \cdot \epsilon_0 \cdot \epsilon_3} = x^{\epsilon_3} = P.$$

In other words P is a fixed point of the permutation $\epsilon_0 \cdot \epsilon_3$.

Since the number of fixed points is a feature of the cycle structure and the cycle structure is
invariant under conjugation, we see that the number of fixed points of $\epsilon_0 \cdot \epsilon_3$ is
the same irrespective of the plugboard setting.
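
Spotting females in intercepted traffic is a purely mechanical test on the six enciphered key
letters. A small sketch (the function name `females` is ours):

```python
def females(enciphered_key):
    """Return the positions j in {0, 1, 2} at which the six-letter
    enciphered message key has the same letter at j and j + 3."""
    if len(enciphered_key) != 6:
        raise ValueError("expected the six letters of the enciphered key")
    return [j for j in range(3)
            if enciphered_key[j] == enciphered_key[j + 3]]


print(females("POWKNP"))   # no females
print(females("POWPNL"))   # one female, in position zero
```

A female at position $j$ certifies a fixed point of $\epsilon_j \cdot \epsilon_{j+3}$ without any
knowledge of the plugboard.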

The use of such females was made easier by so-called Zygalski sheets. The following precomputation
was performed, for each rotor order. An Enigma machine was set up with rings in position
AAA and then, for each position A to Z of the third (leftmost) rotor a sheet was created. This
sheet was a table of 51 by 51 squares, consisting of the letters of the alphabet repeated twice in
each direction minus one row and column. A square was removed if the Enigma machine with first
and second rotor with that row/column position had a fixed point in the permutation
$\epsilon_0 \cdot \epsilon_3$. So for each rotor order there was a set of 26 sheets.

Note,  we  are  going  to  use  the  sheets  to  compute  the  day  ring  setting,  but  the  computation  is 
done  using  different  rotor  positions  but  with  a  fixed  ring  setting.  This  is  because  it  is  easier  with 
an  Enigma  machine  to  rotate  the  rotor  positions  than  to  change  the  ring  settings.  Then  converting 
between  ring  and  rotor  settings  is  simple. 

In fact, it makes sense to also produce a set of sheets for the permutations
$\epsilon_1 \cdot \epsilon_4$ and $\epsilon_2 \cdot \epsilon_5$, as without these the number of keys found
by the following method is quite large. Hence, for each rotor order we will have $26 \times 3$
perforated sheets. We now describe the method used by the Polish analysts when only three rotors
were used (extending it to five rotors is simple but was time consuming at the time). We proceed via
an example. Suppose a set of message headers are received in one day. From these we keep all those
which possess a female in the part corresponding to the encryption of the message key. For example,
we obtain the following message headers:


HUXTBPGNP  BILJWWRRW  YXMHCUHHR  RELCOYXOF  MIPVRYVCR
DYRHFLGFS  QYRZXOZJV  FUGWINCIA  XNEDLLDHK  MQYVVPVKA
XTMRSZRCX  SZYJPFBPY  BNAXGHFGG  MWCQOPQVN  TQNJSSIQS
YGZVQWZQH  MWIBUMWRM  TLCXYUXYC  AMQCZQCTR  KHMCKKCIL
LQUXIBFIV  DSJVDFVTT  UPWMQJMSA  DKQKFEJVE  NXRZNYXNV
HOMFCSQCM  CQJEHOVBO  SHBOGIOQQ  AMUIXVVFV  ZSCTTETBH
VELVUOVDC  QWMUKBUVG  UROVRUAWU  SJECXKCFN  TXGHFDJFZ


Now assuming a given rotor order, say the rightmost rotor is rotor I, the middle one is rotor II
and the leftmost rotor is rotor III, we remove all those headers which could have had a stepping
action of the middle rotor in the first six encryptions. To compute these we take the third character
of the above message headers, i.e. the position $p_1$ of the rightmost rotor in the encryption of the
message key, and the position of the notch on the rightmost rotor assuming the rightmost rotor is
I, i.e. $n_1 = 16$, i.e. Q. We compute the value of $m_1$ according to Section 8.2,

$$m_1 = n_1 - p_1 - 1 \pmod{26},$$

and remove all those for which

$$\lfloor (j - m_1 + 26)/26 \rfloor \ne 0 \quad \text{for some } j \in \{0, 1, 2, 3, 4, 5\}.$$

This leaves us with the following message headers


HUXTBPGNP  DYRHFLGFS  YGZVQWZQH  QYRZXOZJV
SZYJPFBPY  MWIBUMWRM  FUGWINCIA  BNAXGHFGG
TLCXYUXYC  XNEDLLDHK  MWCQOPQVN  AMQCZQCTR
MQYVVPVKA  LQUXIBFIV  NXRZNYXNV  AMUIXVVFV
DSJVDFVTT  ZSCTTETBH  SJECXKCFN  UPWMQJMSA
CQJEHOVBO  TXGHFDJFZ  DKQKFEJVE  SHBOGIOQQ
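
The filtering step can be replayed on the example data. In the sketch below the header list is
transcribed from the text, with the digit 0 read as the letter O where it appears; the names
`may_step`, `kept` and `females_03` are ours, and the notch value 16 (i.e. Q) is the one stated
above for rotor I.

```python
HEADERS = """
HUXTBPGNP BILJWWRRW YXMHCUHHR RELCOYXOF MIPVRYVCR
DYRHFLGFS QYRZXOZJV FUGWINCIA XNEDLLDHK MQYVVPVKA
XTMRSZRCX SZYJPFBPY BNAXGHFGG MWCQOPQVN TQNJSSIQS
YGZVQWZQH MWIBUMWRM TLCXYUXYC AMQCZQCTR KHMCKKCIL
LQUXIBFIV DSJVDFVTT UPWMQJMSA DKQKFEJVE NXRZNYXNV
HOMFCSQCM CQJEHOVBO SHBOGIOQQ AMUIXVVFV ZSCTTETBH
VELVUOVDC QWMUKBUVG UROVRUAWU SJECXKCFN TXGHFDJFZ
""".split()

NOTCH_ROTOR_I = 16  # the letter Q


def may_step(header, notch=NOTCH_ROTOR_I):
    """True if the middle rotor could step while the six key
    letters are enciphered, given the rightmost rotor position."""
    p1 = ord(header[2]) - ord("A")
    m1 = (notch - p1 - 1) % 26
    return any((j - m1 + 26) // 26 != 0 for j in range(6))


kept = [h for h in HEADERS if not may_step(h)]

# Headers with a female in the 0/3 position of the enciphered key,
# i.e. the same letter in the fourth and seventh header positions.
females_03 = [h for h in kept if h[3] == h[6]]

print(len(HEADERS), len(kept), len(females_03))
```

On this data 24 of the 35 headers survive the filter, and 11 of those carry a female in the 0/3
position.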


We now consider each of the three sets of females in turn. For ease of discussion we only consider
those corresponding to $\epsilon_0 \cdot \epsilon_3$. We therefore only examine those message headers
which have the same letter in the fourth and seventh positions, i.e.

QYRZXOZJV  TLCXYUXYC  XNEDLLDHK  MWCQOPQVN
AMQCZQCTR  MQYVVPVKA  DSJVDFVTT  ZSCTTETBH
SJECXKCFN  UPWMQJMSA  SHBOGIOQQ

We now perform the following operation, for each letter $p_3$. We take the Zygalski sheet for rotor
order III, II, I, permutation $\epsilon_0 \cdot \epsilon_3$ and letter $p_3$ and we place this down on
the table. We think of this first sheet as corresponding to the ring setting


r3  =  Q  -  Q  =  A, 


where the Q comes from the first letter in the first message header, QYRZXOZJV. Each row $r$ and
column $c$ of the first sheet corresponds to the ring setting

$$r_1 = R - r,$$
$$r_2 = Y - c.$$


We now repeat the following process for each message header with a first letter which we have not
met before. We take the first letter of the next message header, TLCXYUXYC, in this case T, and we
take the sheet with label

$$p_3 + T - Q.$$

This sheet then has to be placed on top of the other sheets at a certain offset to the original sheet.
The offset is computed by taking the top leftmost square of the new sheet and placing it on top of
the square $(r, c)$ of the first sheet where

$$r = R - C,$$
$$c = Y - L,$$

i.e. we take the difference between the third (resp. second) letter of the new message header and
the third (resp. second) letter of the first message header.

This process is repeated until all of the given message headers are used up. Any square which
is now clear on all sheets then gives a possible setting for the rings for that day. The actual setting
can be read off the first sheet using the correspondence above.
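
The bookkeeping for each new sheet reduces to three letter differences. The sketch below computes
them for a pair of headers; reducing the differences modulo 26 is our assumption (the sheets repeat
the alphabet, so a wrapped offset should denote the same placement), and the function names are
ours.

```python
def letter_diff(a, b):
    """Difference a - b of two letters, reduced modulo 26 (assumed)."""
    return (ord(a) - ord(b)) % 26


def sheet_placement(first_header, header):
    """Return (label_shift, row, col) for the sheet belonging to
    `header`, relative to the sheet of `first_header`.

    label_shift is added to the letter p_3 to select the sheet;
    (row, col) is the square of the first sheet on which the top
    leftmost square of the new sheet is placed."""
    label_shift = letter_diff(header[0], first_header[0])
    row = letter_diff(first_header[2], header[2])
    col = letter_diff(first_header[1], header[1])
    return label_shift, row, col


# The example from the text: T - Q, R - C and Y - L.
print(sheet_placement("QYRZXOZJV", "TLCXYUXYC"))
```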

This process will give a relatively large number of possible ring settings for each possible rotor
order. However, when we intersect the possible values obtained from considering the females in the
0/3 position with those in the 1/4 and the 2/5 positions we find that the number of possibilities
shrinks dramatically. Often this allows us to uniquely determine the rotor order and ring setting
for the day. We determine in our example that the rotor order is given by III, II and I, with ring
settings given by $r_1 = A$, $r_2 = B$ and $r_3 = C$.

To  determine  the  plugboard  settings  for  the  day  we  can  either  use  a  piece  of  known  plaintext  as 
before,  or,  if  no  such  text  is  available,  we  can  use  the  females  to  help  drastically  reduce  the  number 
of  possibilities  for  the  plugboard  settings. 

8.8.  Known  Plaintext  Attack  and  the  Bombes 

Turing  (among  others)  wanted  a  technique  to  break  Enigma  which  did  not  rely  on  the  way  the 
German  military  used  the  system,  which  could  and  did  change.  Turing  settled  on  a  known  plaintext 
attack,  using  what  was  known  at  the  time  as  a  “crib” .  A  crib  was  a  piece  of  plaintext  which  was 
suspected  to  lie  in  the  given  piece  of  ciphertext. 

The  methodology  of  this  technique  was,  from  a  given  piece  of  ciphertext  and  a  suspected  piece 
of  corresponding  plaintext,  to  first  deduce  a  so-called  “menu”.  A  menu  is  simply  a  graph  which 
represents  the  various  relationships  between  ciphertext  and  plaintext  letters.  Then  the  menu  was 
used  to  program  an  electro-mechanical  device,  called  a  Bombe.  A  Bombe  was  a  device  which 
enumerated  the  Enigma  wheel  positions  and,  given  the  data  in  the  menu,  deduced  the  possible 
settings  for  the  rotor  orders,  wheel  positions  and  some  of  the  plugboard.  Finally,  the  ring  positions 
and  the  remaining  parts  of  the  plugboard  needed  to  be  found. 

In  the  following  we  present  a  version  of  this  technique  which  we  have  deduced  from  various 
sources.  We  follow  a  running  example  through  so  as  to  explain  the  method  in  more  detail. 

From  Ciphertext  to  a  Menu:  Suppose  we  receive  the  following  ciphertext 

HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS 

and  that  we  know,  for  example  because  we  suspect  it  to  be  a  shipping  forecast,  that  the  ciphertext 
encrypts  at  some  point  the  plaintext1 

DOGGERFISHERGERMANBIGHTEAST

Now,  we  know  that  in  the  Enigma  machine,  a  letter  cannot  decrypt  to  itself.  This  means  that  there 
are  only  a  few  positions  for  which  the  plaintext  will  align  correctly  with  the  ciphertext.  Consider 
the  following  alignment: 

HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS
-DOGGERFISHERGERMANBIGHTEAST---------------

then  we  see  that  this  is  impossible  since  the  S  in  the  plaintext  FISHER  cannot  correspond  to  the  S 
in  the  ciphertext.  Continuing  in  this  way  we  find  that  there  are  only  six  possible  alignments  of  the 
plaintext  fragment  with  the  ciphertext: 

HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS
DOGGERFISHERGERMANBIGHTEAST----------------
---DOGGERFISHERGERMANBIGHTEAST-------------
-----DOGGERFISHERGERMANBIGHTEAST-----------
--------DOGGERFISHERGERMANBIGHTEAST--------
----------DOGGERFISHERGERMANBIGHTEAST------
----------------DOGGERFISHERGERMANBIGHTEAST

1 This plaintext refers to sea regions in the BBC shipping weather forecast.

In  the  following  we  will  focus  on  the  first  alignment,  i.e.  we  will  assume  that  the  first  ciphertext 
letter  H  decrypts  to  D  and  so  on.  In  practice  the  correct  alignment  out  of  all  the  possible  ones 
would  need  to  be  deduced  by  skill,  judgement  and  experience.  However,  in  any  given  day  a  number 
of  such  cribs  would  be  obtained  and  so  only  the  most  likely  ones  would  be  accepted  for  use  in  the 
following  procedure. 
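
The admissible alignments follow directly from the rule that no letter encrypts to itself, so they
can be enumerated automatically. A short sketch (the function name is ours; offsets are counted from
zero):

```python
CIPHERTEXT = "HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS"
CRIB = "DOGGERFISHERGERMANBIGHTEAST"


def valid_alignments(ciphertext, crib):
    """Offsets at which no crib letter sits under an equal
    ciphertext letter."""
    last = len(ciphertext) - len(crib)
    return [k for k in range(last + 1)
            if all(c != p for c, p in zip(ciphertext[k:], crib))]


offsets = valid_alignments(CIPHERTEXT, CRIB)
print(offsets)
for k in offsets:
    tail = len(CIPHERTEXT) - len(CRIB) - k
    print("-" * k + CRIB + "-" * tail)
```

For this ciphertext the six surviving offsets are 0, 3, 5, 8, 10 and 16; the first of these is the
alignment used in what follows.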

As is usual with all our techniques there is a problem if the middle rotor turns over in the
part of the ciphertext which we are considering. Our crib is 27 letters long, so we could treat it
in two sections of 13 letters (and drop the last letter). The advantage of this is
that we know the middle rotor will only advance once every 26 turns of the first rotor. Hence, by
selecting two groups of roughly 13 letters we can obtain two possible alignments, one of which we
know does not contain a middle rotor movement.

We  therefore  concentrate  on  the  following  two  alignments: 

HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS
DOGGERFISHERG------------------------------
-------------ERMANBIGHTEAS-----------------

We  now  deal  with  each  alignment  in  turn  and  examine  the  various  pairs  of  letters.  We  note  that  if 
H  encrypts  to  D  in  the  first  position  then  D  will  encrypt  to  H  in  the  same  Enigma  configuration. 
We  make  a  record  of  the  letters  and  the  positions  for  which  one  letter  encrypts  to  the  other.  These 
are  placed  in  a  graph  with  vertices  corresponding  to  the  letters  and  edges  being  labelled  by  the 
positions  of  the  related  encryptions.  This  results  in  the  two  graphs  (or  menus)  given  in  Figures  8.2 
and  8.3: 


Figure  8.2.  Menu  1 

These menus tell us a lot about the configuration of the Enigma machine, in terms of its
underlying permutations. Each menu is then used to program a Bombe. In fact we program one
Bombe not only for each menu, but also for each possible rotor order. Thus if five rotors are
in use, giving 60 rotor orders, we need to program $2 \cdot 60 = 120$ such Bombes.
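
Since the figures are drawn by hand from the alignments, it may help to see the same graphs built
programmatically. In this sketch a menu is stored as a dictionary from an unordered pair of letters
to the list of positions labelling that edge; labelling edges by the zero-based position in the full
ciphertext is our assumption, and the figures may use a different labelling.

```python
from collections import defaultdict

CIPHERTEXT = "HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS"


def build_menu(ciphertext, crib, start):
    """Map each {cipher letter, plain letter} pair to the positions
    at which one of the two letters encrypts to the other."""
    menu = defaultdict(list)
    for offset, plain in enumerate(crib):
        pos = start + offset
        menu[frozenset((ciphertext[pos], plain))].append(pos)
    return dict(menu)


menu1 = build_menu(CIPHERTEXT, "DOGGERFISHERG", 0)
menu2 = build_menu(CIPHERTEXT, "ERMANBIGHTEAS", 13)

for name, menu in (("Menu 1", menu1), ("Menu 2", menu2)):
    print(name)
    for pair, labels in sorted(menu.items(), key=lambda kv: kv[1]):
        print("  ", "-".join(sorted(pair)), labels)
```

An edge carrying two labels, such as the one between S and G in the first menu, is the simplest
kind of cycle in the graph.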

Figure 8.3. Menu 2

8.8.1. The Turing/Welchman Bombe: There are many descriptions of the Bombe as an electrical
circuit. In the following we present the basic workings of the Bombe in terms of a modern
computer; note however that in practice this is not very efficient. The Bombe's electrical circuit
was able to execute the basic operations at the speed of light (i.e. the time it takes for a current
to pass around a circuit), hence simulating this with a modern computer is very inefficient. The
Bombe was in fact an electro-mechanical computer which was designed to perform a specific task,
namely determining the wheel settings of the Enigma machine given one of the menus described
above. The initial design was made by Turing, but Welchman contributed a vital component (called
the diagonal board) which made the process very efficient.

I have found that the best way to think of the Bombe is as a computer with 26 registers, each
of which is 26 bits in length. In a single “step” of the Bombe a single bit in one register bank is set.
Say we set bit F of register H; this corresponds to us wishing to test whether F is plugged to H in
the actual Enigma configuration. This testing is done with the wheels in a given specific location.
The Bombe then passes through a series of states until it stabilizes. In the actual Bombe this
occurs at the speed of light; in a modern computer simulation it needs to be actually programmed
and so occurs at the speed of a computer. Once the register bank stabilizes, each bit that is set
means that if the tested condition is true, and the wheels are in the correct position, then so must
this condition be true, i.e. if bit J of register K is set then J should be plugged to K in the Enigma
machine. If we obtain a contradiction, then either the initial tested condition is false, or the wheels
are in the wrong position. Continuing in this way with different wheel settings we either deduce
the correct wheel settings, or deduce that the initial tested condition is false. Whilst the test of
a specific configuration in the real Bombe happens at the speed of light, the moving to the next
wheel setting happens mechanically. Thus, despite the test being faster on the Bombe than on a
modern computer, the modern computer can outperform the Bombe for the overall algorithm.

In other words the Bombe deduces a “Theorem” of the form

If (F → H and the wheels are in position P) Then K → J.

With this interpretation the diagonal board detailed in descriptions of the Bombe is then the
obvious condition that if K is plugged to J, then J is also plugged to K, i.e. if bit J of register K
is set, then so must be bit K of register J. In the real Bombe this is achieved using wires, however
in a computer simulation it means that we always set the “transpose” bit when setting any bit in
our register bank. Thus, the register bank is symmetric down the leading diagonal. The diagonal
board drastically increases the usefulness of the Bombe in breaking arbitrary cribs.

To understand how the menu acts on the set of registers we define the following permutation
for $0 \le i < 26^3$, for a given choice of rotors $\rho_1$, $\rho_2$ and $\rho_3$. We write
$i = i_1 + i_2 \cdot 26 + i_3 \cdot 26^2$, and define

$$\delta_{i,s} = (\sigma^{i_1+s+1} \cdot \rho_1 \cdot \sigma^{-i_1-s-1}) \cdot (\sigma^{i_2} \cdot \rho_2 \cdot \sigma^{-i_2}) \cdot (\sigma^{i_3} \cdot \rho_3 \cdot \sigma^{-i_3}) \cdot \varrho$$
$$\qquad \cdot (\sigma^{i_3} \cdot \rho_3^{-1} \cdot \sigma^{-i_3}) \cdot (\sigma^{i_2} \cdot \rho_2^{-1} \cdot \sigma^{-i_2}) \cdot (\sigma^{i_1+s+1} \cdot \rho_1^{-1} \cdot \sigma^{-i_1-s-1}).$$

Note how similar this is to the equation of the Enigma machine. The main difference is that the
second and third rotors cycle through at a different rate (depending only on $i$). The variable $i$
is used to denote the rotor position which we currently wish to test and the variable $s$ is used to
denote the action of the menu, as we shall now describe.

The menu acts on the registers as follows: for each link $x \xrightarrow{s} y$ in the menu we take
register $x$ and for each set bit $x_z$ we apply $\delta_{i,s}$ to obtain $x_w$. Then the bit $x_w$ is
set in register $y$ and (due to the diagonal board) bit $y$ is set in register $x_w$. We also need to
apply the link backwards, so for each set bit $y_z$ in register $y$ we apply $\delta_{i,s}$ to obtain
$y_w$. Then bit $y_w$ is set in register $x$ and (again due to the diagonal board) bit $x$ is set in
register $y_w$.

We  now  let  l  denote  the  letter  which  satisfies  at  least  one  of  the  following,  and  we  hope  all 
three: 

(1)  A  letter  which  occurs  more  often  than  any  other  letter  in  the  menu. 

(2)  A  letter  which  occurs  in  more  cycles  than  any  other  letter. 

(3)  A  letter  which  occurs  in  the  largest  connected  component  of  the  graph  of  the  menu. 

In the above two menus we have several letters to choose from in Menu 1, so we select $l = S$; in
Menu 2 we select $l = E$. For each value of $i$ we then perform the following operation

•  Unset  all  bits  in  the  registers. 

•  Set  bit  l  of  register  l. 

•  Keep  applying  the  menu,  as  above,  until  the  registers  no  longer  change  at  all. 

Hence, the above algorithm is working out the consequences of the letter $l$ being plugged to itself,
given the choice of rotors $\rho_1$, $\rho_2$ and $\rho_3$. It is the third line in the above algorithm
which operates at the speed of light in the real Bombe. In a modern simulation this takes a lot longer.

After the registers converge to a steady state we then test them to see whether a possible value
of $i$, i.e. a possible value of the rotor position, has been found. We then step $i$ on by one, which
in the real Bombe is achieved by rotating the rotors, and repeat. A value of $i$ which passes this
test is called a “Bombe Stop”.

To  see  what  is  a  valid  value  of  i,  suppose  we  have  the  rotors  in  the  correct  positions.  If  the 
plugboard  hypothesis  that  the  letter  l  is  plugged  to  itself  is  true,  then  the  registers  will  converge 
to  a  state  which  gives  the  plugboard  settings  for  the  registers  in  the  graph  of  the  menu  which 
are  connected  to  the  letter  l.  If,  however,  the  plugboard  hypothesis  is  wrong  then  the  registers 
will  converge  to  a  different  state,  in  particular  the  bit  of  each  register  which  corresponds  to  the 
correct  plugboard  configuration  will  never  be  set.  The  best  we  can  then  expect  is  that  this  wrong 
hypothesis  propagates  and  all  registers  in  the  connected  component  become  set  with  25  bits.  The 
one  remaining  unset  bit  then  corresponds  to  the  correct  plugboard  setting  for  the  letter  l.  If  the 
rotor  position  is  wrong  then  it  is  highly  likely  that  all  the  bits  in  the  test  register  l  converge  to  the 
set  position. 

To summarize, we have one of the following situations upon convergence of the registers at step
$i$:

•  All 26 bits of test register $l$ are set. This implies that the rotors are not in the correct
position and we can step on $i$ by one and repeat the whole process.

•  One bit of test register $l$ is set, the rest being unset. This is a possible correct configuration
for the rotors. If it is indeed the correct configuration then, in addition, the set bit
corresponds to the correct plug setting for register $l$, and the single bit set in the registers
corresponding to the letters connected to $l$ in the menu will give us the plug settings for
those letters as well.

•  One bit of the test register $l$ is unset, the rest being set. This is also a possible correct
configuration for the rotors. If it is indeed the correct configuration then, in addition, the
unset bit corresponds to the correct plug setting for register $l$, and any single unset bit in
the registers corresponding to the letters connected to $l$ in the menu will give us the plug
settings for those letters as well.

•  The number of set bits in register $l$ lies in $[2, \ldots, 24]$. These are relatively rare
occurrences, and although they could correspond to actual rotor settings they tell us little
directly about the plug settings. The problem could be because the initial plug hypothesis is
false. For “good” menus we find that stops like this are very rare indeed.
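
The register model described above translates almost line for line into code. The following is a
minimal sketch under stated assumptions, not a reconstruction of the real Bombe: the rotors and
reflector are random permutations rather than historical wirings, the shift is taken to send each
letter to its predecessor, and the crib is enciphered by the same toy machine so that the true
rotor position and plugboard are known. All names (`make_delta`, `bombe_test`, `simulate`) are ours.

```python
import random

N = 26
A = ord("A")


def inverse(p):
    inv = [0] * N
    for x, y in enumerate(p):
        inv[y] = x
    return inv


def make_delta(rotors, reflector):
    """Return delta(i, s, x), the rotor/reflector core delta_{i,s}."""
    inverses = [inverse(r) for r in rotors]

    def through(perm, t, x):
        # sigma^t . perm . sigma^-t, with sigma: x -> x - 1 (assumed)
        return (perm[(x - t) % N] + t) % N

    def delta(i, s, x):
        ts = (i % N + s + 1, (i // N) % N, i // (N * N))
        for rho, t in zip(rotors, ts):
            x = through(rho, t, x)
        x = reflector[x]
        for rho_inv, t in reversed(list(zip(inverses, ts))):
            x = through(rho_inv, t, x)
        return x

    return delta


def bombe_test(menu, delta, i, test_letter):
    """Propagate the hypothesis 'test_letter is plugged to itself'
    at rotor position i until the registers no longer change."""
    registers = [set() for _ in range(N)]
    registers[test_letter].add(test_letter)
    changed = True
    while changed:
        before = sum(len(r) for r in registers)
        for x, y, s in menu:
            for src, dst in ((x, y), (y, x)):   # link used both ways
                for z in list(registers[src]):
                    w = delta(i, s, z)
                    registers[dst].add(w)       # dst plugged to w
                    registers[w].add(dst)       # diagonal board
        changed = sum(len(r) for r in registers) != before
    return registers


def simulate(seed, crib, plug_pairs, position):
    """Build a toy machine, encipher the crib and return the menu."""
    rng = random.Random(seed)
    rotors = [rng.sample(range(N), N) for _ in range(3)]
    letters = rng.sample(range(N), N)
    reflector = [0] * N
    for a, b in zip(letters[0::2], letters[1::2]):
        reflector[a], reflector[b] = b, a
    tau = list(range(N))
    for a, b in plug_pairs:
        a, b = ord(a) - A, ord(b) - A
        tau[a], tau[b] = b, a
    delta = make_delta(rotors, reflector)
    menu = []
    for s, ch in enumerate(crib):
        p = ord(ch) - A
        menu.append((p, tau[delta(position, s, tau[p])], s))
    return delta, tau, menu


POSITION = 4321
delta, tau, menu = simulate(0, "DOGGERFISHERG",
                            [("D", "K"), ("G", "Q"), ("S", "Z")], POSITION)
E, G = ord("E") - A, ord("G") - A
regs_true = bombe_test(menu, delta, POSITION, E)    # E really is unplugged
regs_false = bombe_test(menu, delta, POSITION, G)   # G is plugged to Q
print(len(regs_true[E]), len(regs_false[G]))
```

At the correct rotor position a true hypothesis only ever sets correct bits, while a false one can
never set the correct bit of the test register; a wrong rotor position gives no such guarantee,
which is why a full test register is taken as a rejection.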

A  Bombe  stop  is  a  position  where  the  machine  decides  that  it  has  reached  a  possibly  correct 
configuration  of  the  rotors.  The  number  of  such  stops  per  rotor  order  depends  on  the  structure 
of  the  graph  of  the  menu.  Turing  determined  the  expected  number  of  stops  for  different  types  of 
menus.  The  following  table  shows  the  expected  number  of  stops  per  rotor  order  for  a  connected 
menu  (i.e.  only  one  component)  with  various  numbers  of  letters  and  cycles. 


                          Number of Letters
Cycles       8       9      10      11      12      13      14      15      16
   3       2.2     1.1    0.42    0.14    0.04     ≈ 0       0     ≈ 0       0
   2        58      28      11     3.8     1.2     0.3    0.06     ≈ 0     ≈ 0
   1      1500     720     280     100      31     7.7     1.6    0.28    0.04
   0     40000   19000    7300    2700     820     200      43     7.3     1.0

This  also  gives  an  upper  bound  on  the  expected  number  of  stops  for  an  unconnected  menu  in 
terms  of  the  size  of  the  largest  connected  component  and  the  number  of  cycles  within  the  largest 
connected  component. 

Hence,  a  good  menu  is  not  only  one  which  has  a  large  connected  component  but  which  also 
has  a  number  of  cycles.  Our  second  example  menu  is  particularly  poor  in  this  respect.  Note  that 
a  large  number  of  letters  in  the  connected  component  not  only  reduces  the  expected  number  of 
Bombe  stops  but  also  increases  the  number  of  deductions  about  possible  plugboard  configurations. 

8.8.2.  Bombe  Stop  to  Plugboard:  We  now  need  to  work  out  how  from  a  Bombe  stop  we  can 
either  deduce  the  actual  key,  or  deduce  that  the  stop  has  occurred  simply  by  chance  and  does  not 
correspond  to  a  correct  configuration.  We  first  sum  up  how  many  stops  there  are  in  our  example 
above.  For  each  menu  we  specify,  in  the  following  table,  the  number  of  Bombe  stops  which  arise 
and  we  also  specify  the  number  of  bits  in  the  test  register  l  which  gave  rise  to  the  stop. 


                        Number of Bits Set
Menu       1      2      3      4   5-20     21     22     23     24      25
  1      137      0      0      0      0      0      0      0      9    1551
  2     2606    148      9      2      0      2      7    122   2024   29142

Here  we  can  see  the  effect  of  the  difference  in  size  of  the  largest  connected  component.  In  both 
menus  the  largest  connected  component  has  a  single  cycle  in  it;  in  both  cases  being  an  edge  with 
two  different  labels.  For  the  first  menu  we  obtain  a  total  of  1697  stops,  or  28.3  stops  per  rotor 
order.  The  connected  component  has  eleven  letters  in  it,  so  this  yield  is  much  better  than  the  yield 
expected  from  Turing’s  earlier  table.  This  is  due  to  the  extra  two-letter  component  in  the  graph  of 
Menu  One.  For  Menu  Two  we  obtain  a  total  of  34  062  stops,  or  567.7  stops  per  rotor  order.  The 
connected component in the second menu has six letters in it, so although this figure is bad it is in
fact better than the maximum expected from Turing’s table. Again this is due to the presence of
other components in the graph.

Given the large number of stops we need a way of automating the checking process. It turns
out that this is relatively simple as the state of the registers allows other conditions to be checked
automatically. Recall that the Bombe stop also gives us information about the state of the supposed
plugboard.  The  following  are  so-called  “legal  contradictions”,  which  can  be  eliminated  instantly 
from  the  above  stops,  assuming  the  initial  plug  supposition  is  correct: 

•  If  any  Bombe  register  has  26  bits  set  then  this  Bombe  configuration  is  impossible. 

•  If  the  Bombe  registers  imply  that  a  letter  is  plugged  to  two  different  letters  then  this  is 
clearly  a  contradiction. 

Suppose  we  know  that  the  plugboard  uses  a  certain  number  of  plugs  (in  our  example  this  number 
is  ten);  if  the  registers  imply  that  there  are  more  than  this  number  of  plugs  then  this  is  also 
a  contradiction.  Applying  these  conditions  means  we  are  down  to  only  19  750  possible  Bombe 
stops  out  of  the  35  759  total  stops  above.  Of  these,  109  correspond  to  the  first  menu  and  the  rest 
correspond  to  the  second  menu. 
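A minimal sketch of this automatic filtering is given below. It assumes the register state has already been turned into a list of implied plug pairs (a pair such as ("G", "G") meaning the letter is plugged to itself); the function name and argument names are hypothetical.

```python
def find_contradiction(implied_pairs, full_registers=(), max_plugs=10):
    """Return a reason if the stop is contradictory, otherwise None.

    implied_pairs  -- (letter, letter) pairs implied by the registers
    full_registers -- letters whose register has all 26 bits set
    max_plugs      -- number of plugs the plugboard is known to use
    """
    if full_registers:
        return "a register has all 26 bits set"
    partner = {}
    for a, b in implied_pairs:
        # Record the pairing in both directions and look for a clash.
        for x, y in ((a, b), (b, a)):
            if partner.setdefault(x, y) != y:
                return f"{x} is plugged to both {partner[x]} and {y}"
    plugs = {frozenset(pair) for pair in implied_pairs if pair[0] != pair[1]}
    if len(plugs) > max_plugs:
        return "more plugs than the plugboard uses"
    return None
```

A stop survives only if the function returns None; every other outcome corresponds to one of the contradictions listed above.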

We  clearly  cannot  cope  with  all  of  those  corresponding  to  the  second  menu  so  let’s  suppose 
that  the  second  rotor  does  not  turn  over  in  the  first  thirteen  characters.  This  means  we  now  only 
need  to  focus  on  the  first  menu.  In  practice  a  number  of  configurations  could  be  eliminated  due  to 
operational  requirements  imposed  on  the  German  operators  (e.g.  not  using  the  same  rotor  order 
on  consecutive  days). 

8.8.3.  Finding  the  Final  Part  of  the  Key:  We  will  focus  on  the  first  two  remaining  stops  for 
the first menu. Both of these correspond to rotor orders where the rightmost (fastest) rotor is rotor
I, the middle one is rotor II and the leftmost rotor is rotor III.

The first remaining stop is at Bombe configuration $i_1 = p_1 - r_1 = T$, $i_2 = p_2 - r_2 = W$ and
$i_3 = p_3 - r_3 = K$. The final register state in this configuration is given in Table 8.1, where
rows represent registers and columns the bits.

      A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  A   0 0 0 1 1 0 1 1 1 0 0 0 0 1 0 0 0 1 1 1 0 1 1 0 0 0
  B   0 0 0 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 1 1 1 1 0 0 0
  C   0 0 0 0 1 1 1 1 1 0 0 0 0 1 0 0 0 1 1 1 0 1 1 1 0 0
  D   1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  E   1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  F   0 0 1 1 1 1 1 1 1 0 0 1 1 0 0 0 0 1 1 1 0 1 1 1 1 0
  G   1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  H   1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  I   1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  J   0 0 0 1 1 0 1 0 1 0 0 0 0 1 1 0 0 1 1 1 1 1 1 0 0 0
  K   0 0 0 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 1 1 1 1 0 0 0
  L   0 0 0 1 1 1 1 1 1 0 0 0 0 1 1 0 0 1 0 1 1 1 1 0 0 0
  M   0 0 0 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 1 1 1 1 1 0 0 0
  N   1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  O   0 1 0 1 1 0 1 1 1 1 1 1 0 1 0 0 1 1 1 1 1 0 1 1 1 1
  P   0 0 0 1 1 0 1 1 1 0 0 0 0 1 0 0 0 1 1 1 0 1 1 0 0 0
  Q   0 0 0 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 1 1 1 1 0 0 0
  R   1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1
  S   1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  T   1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0
  U   0 1 0 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 1 1 0 1 0 0 1 1
  V   1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1
  W   1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1
  X   0 0 1 1 1 1 1 1 1 0 0 0 0 1 1 0 0 1 1 1 0 1 1 1 1 0
  Y   0 0 0 1 1 1 1 1 1 0 0 0 0 1 1 0 0 1 1 1 1 1 1 1 0 0
  Z   0 0 0 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 0 1 1 1 0 0 0

Table 8.1. The registers on the first Bombe stop

The test register S has 25 bits set, so in this configuration each set bit of a register implies that
the register's letter is not plugged to the letter corresponding to that bit. The plugboard setting
is deduced to contain the plugs

C ↔ D, E ↔ I, F ↔ N, H ↔ J, L ↔ S,

M ↔ R, O ↔ V, T ↔ Z, U ↔ W,

whilst  the  letter  G  is  known  to  be  plugged  to  itself,  assuming  this  is  the  correct  configuration. 
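The deduction of these plugs from the register state can be sketched as follows. The bit strings are a transcription of Table 8.1 (columns A to Z), so any transcription slip would carry over; the helper names `all_but` and `deduce_plugs` are hypothetical. A register with exactly 25 bits set names its plug partner through its single unset bit.

```python
from string import ascii_uppercase as LETTERS

def all_but(letter: str) -> str:
    """A register with every bit set except the one for `letter`."""
    return "".join("0" if x == letter else "1" for x in LETTERS)

# Register state at the first stop, one 26-character bit string per register.
REGISTERS = {
    "A": "00011011100001000111011000", "B": "00011011100001100111111000",
    "C": "00001111100001000111011100", "D": all_but("C"),
    "E": all_but("I"),                 "F": "00111111100110000111011110",
    "G": all_but("G"),                 "H": all_but("J"),
    "I": all_but("E"),                 "J": "00011010100001100111111000",
    "K": "00011011100001100111111000", "L": "00011111100001100101111000",
    "M": "00011111100001000011111000", "N": all_but("F"),
    "O": "01011011111101001111101111", "P": "00011011100001000111011000",
    "Q": "00011011100001100111111000", "R": all_but("M"),
    "S": all_but("L"),                 "T": all_but("Z"),
    "U": "01011011111111101111010011", "V": all_but("O"),
    "W": all_but("U"),                 "X": "00111111100001100111011110",
    "Y": "00011111100001100111111100", "Z": "00011011100001100110111000",
}

def deduce_plugs(registers):
    """Return (plugs, self_plugged) read off the registers with 25 bits set."""
    plugs, self_plugged = set(), set()
    for reg, bits in registers.items():
        if bits.count("1") != 25:
            continue
        partner = LETTERS[bits.index("0")]
        if partner == reg:
            self_plugged.add(reg)
        else:
            plugs.add(frozenset((reg, partner)))
    return plugs, self_plugged

plugs, self_plugged = deduce_plugs(REGISTERS)
print(sorted("".join(sorted(p)) for p in plugs), sorted(self_plugged))
```

With this transcription the nine plugs listed above and the self-plugged letter G are recovered.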

So we need to find one other plug and the ring settings. We can assume that $r_3 = 0 = A$ as
it plays no part in the actual decryption process. Since we are using rotor I as the rightmost
rotor we know that $n_1 = 16$, i.e. Q, which, combined with the fact that we are assuming that no
stepping occurs in the first thirteen characters, implies that $p_1$ must satisfy

$j - ((16 - p_1 - 1) \pmod{26}) + 26 \le 25$ for $j = 0, \ldots, 12$,

i.e. $p_1$ = 0, 1, 2, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25.
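The list of admissible values can be checked by brute force. The sketch below assumes the convention A = 0, ..., Z = 25 and that a turnover happens when the left-hand side of the condition reaches 26; the function name is hypothetical.

```python
def no_turnover_positions(notch: int = 16, length: int = 13) -> list[int]:
    """Start positions p1 for which no turnover occurs in the first `length` letters."""
    return [
        p1
        for p1 in range(26)
        if all(j - ((notch - p1 - 1) % 26) + 26 <= 25 for j in range(length))
    ]

print(no_turnover_positions())
```

Changing `length` shows how quickly the admissible set shrinks as the crib gets longer.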

With the Enigma setting of $p_1 = T$, $p_2 = W$, $p_3 = K$ and $r_1 = r_2 = r_3 = A$ and the above
(incomplete) plugboard we decrypt the fragment of ciphertext and compare the resulting plaintext
with the crib.

HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS
DVGGERLISHERGMWBRXZSWNVOMQOQKLKCSQLRRHPVCAG
DOGGERFISHERGERMANBIGHTEAST-

This is very much like the supposed plaintext. Examine the first incorrect letter, which occurs
in position two. This error cannot be due to a second rotor turnover, because of our assumption,
hence it must be due to a missing plugboard element. If we let $\gamma_1$ denote the current approximation
to the permutation representing the Enigma machine for letter one and $\tau$ the missing plugboard
setting then we have

$U^{\gamma_1} = V$ and $U^{\tau \cdot \gamma_1 \cdot \tau} = O$.

This implies that $\tau$ should contain either a plug involving the letter U or one involving the letter
O; but both of these letters are already used in the plugboard output from the Bombe. Hence, this
configuration must be incorrect.

The second remaining stop is at Bombe configuration $i_1 = p_1 - r_1 = R$, $i_2 = p_2 - r_2 = D$ and
$i_3 = p_3 - r_3 = L$. The plugboard setting is deduced to contain the following plugs

D ↔ Q, E ↔ T, F ↔ N, I ↔ O, S ↔ V, W ↔ X,

whilst the letters G, H and R are known to be plugged to themselves, assuming this is the correct
configuration. These follow from the final register state in this configuration given in Table 8.2. So
we need to find four other plug settings and the ring settings. Again we can assume that $r_3 = A$
as it plays no part in the actual decryption process, and again we deduce that $p_1$ must be one of
0, 1, 2, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25.

With the Enigma setting of $p_1 = R$, $p_2 = D$, $p_3 = L$ and $r_1 = r_2 = r_3 = A$ and the above
(incomplete) plugboard we decrypt the fragment of ciphertext and compare the resulting plaintext
with the crib.

HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS
DOGGERFISHERGNRAMNCOXHXZMORIKOEDEYWEFEYMSDQ
DOGGERFISHERGERMANBIGHTEAST-

      A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  A   0 0 0 1 1 0 1 1 1 0 0 0 0 1 0 0 0 1 1 1 0 1 1 0 0 0
  B   0 0 0 1 1 1 1 1 1 0 0 0 0 1 0 0 0 1 1 1 0 1 1 1 0 0
  C   0 0 0 1 1 1 1 1 1 0 0 0 0 1 0 0 0 1 1 1 0 1 1 0 0 0
  D   1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1
  E   1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1
  F   0 1 1 1 1 0 1 1 1 0 0 0 0 0 1 0 0 1 1 1 1 1 1 1 1 0
  G   1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  H   1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  I   1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1
  J   0 0 0 1 1 0 1 1 1 0 0 0 0 1 0 0 0 1 1 1 0 1 1 1 0 0
  K   0 0 0 1 1 0 1 1 1 0 0 0 0 1 0 0 0 1 1 1 0 1 1 0 0 0
  L   0 0 0 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 1 1 1 1 0 0 0
  M   0 0 0 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 1 1 1 1 0 0 0
  N   1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  O   0 0 0 1 1 1 1 1 0 0 0 1 1 1 1 1 0 1 1 1 1 1 1 0 0 1
  P   0 0 0 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 1 1 1 1 0 0 0
  Q   0 0 0 0 1 0 1 1 1 0 0 0 0 1 0 0 0 1 1 1 0 1 1 1 0 0
  R   1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1
  S   1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1
  T   1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  U   0 0 0 1 1 1 1 1 1 0 0 1 1 1 1 1 0 1 1 1 0 1 1 0 0 1
  V   1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1
  W   1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1
  X   0 1 0 1 1 1 1 1 1 1 0 0 0 1 0 0 1 1 1 1 0 1 0 1 1 0
  Y   0 0 0 1 1 1 1 1 1 0 0 0 0 1 0 0 0 1 1 1 0 1 1 1 0 0
  Z   0 0 0 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 1 1 1 1 0 0 0

Table 8.2. The registers on the second Bombe stop

We now look at the first incorrect letter, occurring in the 14th position. Using the same notation
as before, i.e. $\gamma_j$ for the current approximation and $\tau$ for the missing plugs, we see that if this
incorrect operation is due to a plug problem rather than a rotor turnover problem then we must
have

$C^{\tau \cdot \gamma_{13} \cdot \tau} = E$.

Now, E already occurs on the plugboard, via E ↔ T, so $\tau$ must include a plug which maps C to
the letter x where

$x^{\gamma_{13}} = E$.

But we can compute that

$\gamma_{13}$ = (AM)(BE)(CN)(DO)(FI)(GS)(HX)(JU)(KP)(LQ)(RV)(TY)(WZ),

from which we deduce that x = B. So we include the plug C ↔ B in our new approximation and
repeat to obtain the plaintext

HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS
DOGGERFISHERGERAMNBOXHXNMORIKOEMEYWEFEYMSDQ
DOGGERFISHERGERMANBIGHTEAST-
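The plug-finding step can be sketched in a few lines. The sketch assumes that the permutation is given in the cycle notation used in the text, that `fixed` holds every letter whose plug status is already settled (plugged or known to be self-plugged), and that exactly one of the two letters involved is still free; the function names are hypothetical.

```python
def parse_cycles(text: str) -> dict[str, str]:
    """Turn a product of 2-cycles such as '(AM)(BE)' into a lookup table."""
    perm = {}
    for pair in text.strip("()").split(")("):
        a, b = pair
        perm[a], perm[b] = b, a
    return perm

def missing_plug(gamma, cipher, plain, fixed):
    """Return the plug needed so that `cipher` maps to `plain`, or None."""
    if plain in fixed and cipher not in fixed:
        # plain is settled, so cipher must be plugged to the letter gamma sends to plain.
        candidate = (cipher, gamma[plain])
    elif cipher in fixed and plain not in fixed:
        # cipher is settled, so its image under gamma must be plugged to plain.
        candidate = (gamma[cipher], plain)
    else:
        return None  # both letters settled (a contradiction) or both free (ambiguous)
    return None if candidate[0] in fixed or candidate[1] in fixed else candidate

gamma_13 = parse_cycles("(AM)(BE)(CN)(DO)(FI)(GS)(HX)(JU)(KP)(LQ)(RV)(TY)(WZ)")
fixed = set("DQETFNIOSVWX") | set("GHR")
print(missing_plug(gamma_13, "C", "E", fixed))
```

For the first Bombe stop both U and O are already settled, so the same function returns None, matching the conclusion that the configuration is incorrect.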

We then see in the 16th position that we either need to step the rotor or there should be a plug
which means that S maps to M under the cipher. We have, for our new $\gamma_{15}$, that

$\gamma_{15}$ = (AS)(BJ)(CY)(DK)(EX)(FW)(GI)(HU)(LM)(NQ)(OP)(RV)(TZ).

The letter S already occurs in a plug, so we must have that A is plugged to M. We add this plug
into our configuration and repeat

HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS
DOGGERFISHERGERMANBOXHXNAORIKVEAEYWEFEYASDQ
DOGGERFISHERGERMANBIGHTEAST-

Now the 20th character is incorrect: we need P to map to I and not O under the cipher in this
position. Again assuming that this is due to a missing plug we find that

$\gamma_{19}$ = (AH)(BM)(CF)(DY)(EV)(GX)(IK)(JR)(LS)(NT)(OP)(QW)(UZ).

There is already a plug involving the letter I so we deduce that the missing plug should be K ↔ P.
Again we add this new plug into our configuration and repeat to obtain

HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS
DOGGERFISHERGERMANBIXHJNAORIPVXAEYWEFEYASDQ
DOGGERFISHERGERMANBIGHTEAST-

Now the 21st character is wrong as we must have that L maps to G. We know, from the Bombe
stop configuration, that G is plugged to itself, and given

$\gamma_{20}$ = (AI)(BJ)(CW)(DE)(FK)(GZ)(HU)(LX)(MQ)(NT)(OV)(PY)(RS),

we deduce that if this error is due to a plug we must have that L is plugged to Z. We add this
final plug into our configuration and find that we obtain

HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS
DOGGERFISHERGERMANBIGHJNAORIPVXAEYWEFEYAQDQ
DOGGERFISHERGERMANBIGHTEAST-

All  the  additional  plugs  we  have  added  have  been  on  the  assumption  that  no  rotor  turnover  has 
yet  occurred.  Any  further  errors  must  be  due  to  rotor  turnover,  as  we  now  have  a  full  set  of  plugs 
(as  we  know  our  configuration  only  has  ten  plugs  in  use).  If  when  correcting  the  rotor  turnover  we 
still  do  not  decrypt  correctly  we  need  to  back  up  and  repeat  the  process. 

We  see  that  the  next  error  occurs  in  position  23.  This  means  that  a  rotor  turnover  must  have 
occurred  just  before  this  letter  was  encrypted.  In  other  words  we  have 

$22 - ((16 - p_1 - 1) \pmod{26}) + 26 = 26$.

This implies that $p_1 = 19$, i.e. $p_1 = T$, which implies that $r_1 = C$. We now try to decrypt again,
and we obtain

HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS
DOGGERFISHERGERMANBIGHTZWORIPVXAEYWEFEYAQDQ
DOGGERFISHERGERMANBIGHTEAST-
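The step from the turnover position to $p_1$ and $r_1$ can be checked numerically. The sketch assumes A = 0, ..., Z = 25 and uses the Bombe indicator $i_1 = p_1 - r_1 = R$ of this stop; the function name is hypothetical.

```python
def start_positions_for_turnover(j: int, notch: int = 16) -> list[int]:
    """All p1 for which the turnover condition holds with equality at index j."""
    return [p1 for p1 in range(26) if j - ((notch - p1 - 1) % 26) + 26 == 26]

(p1,) = start_positions_for_turnover(22)   # unique solution
i1 = ord("R") - ord("A")                   # indicator found by the Bombe
r1 = (p1 - i1) % 26                        # since i1 = p1 - r1 (mod 26)
print(chr(p1 + ord("A")), chr(r1 + ord("A")))
```

The unpacking `(p1,) = ...` would fail if the condition had no solution or several, which makes the uniqueness explicit.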

But we still do not have the correct plaintext. The only thing which could have happened is that
we have had an incorrect third rotor movement. Rotor II has its notch in position $n_2 = 4$, i.e. E.
If the third rotor moved on at position 24 then we have, in our earlier notation

$m_1 = n_1 - p_1 - 1 \pmod{26} = 16 - 19 - 1 \pmod{26} = 22$,
$m = n_2 - p_2 - 1 \pmod{26} = 4 - p_2 - 1 \pmod{26}$,
$m_2 = m_1 + 1 + 26 \cdot m = 23 + 26 \cdot m$,
$650 = 23 - m_2 + 650$.

This last equation implies that $m_2 = 23$, which implies that $m = 0$, which itself implies that $p_2 = 3$,
i.e. $p_2 = D$. But this is exactly the setting we have for the second rotor. So the problem is not
that the third rotor advances, it is that it should not have advanced. We therefore need to change
this to say $p_2 = E$ and $r_2 = B$ (although this is probably incorrect it will help us to decrypt the
fragment). We find that we then obtain

HUSVTNXRTSWESCGSGVXPLQKCEYUHYMPBNUITUIHNZRS
DOGGERFISHERGERMANBIGHTEASTFORCEFIVEFALLING
DOGGERFISHERGERMANBIGHTEAST-

Hence,  we  can  conclude  that,  apart  from  a  possibly  incorrect  setting  for  the  second  ring  we  have 
the  correct  Enigma  setting  for  this  day. 

8.9.  Ciphertext  Only  Attack 

The  following  attack  allows  one  to  break  the  Enigma  machine  when  only  a  single  ciphertext  is 
given.  The  method  relies  on  the  fact  that  enough  ciphertext  is  given  and  that  a  full  set  of  plugs  is 
not  used.  Suppose  we  have  a  reasonably  large  amount  of  ciphertext,  say  500-odd  characters,  and 
that p plugs are in use. If we knew the rotor positions, around $((26 - 2 \cdot p)/26)^2$ of the letters
would decrypt exactly, as these letters would not pass through a plug either before or after the
rotor stage. Hence, one could distinguish the correct rotor positions by using some statistic to
distinguish a random plaintext from a plaintext in which $((26 - 2 \cdot p)/26)^2$ of the letters are correct.
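As a rough worked example (the plug counts are chosen here purely for illustration), with $p = 8$ and $p = 10$ plugs the expected fractions are

```latex
\[
  \left(\frac{26 - 2\cdot 8}{26}\right)^{2} = \left(\frac{10}{26}\right)^{2} \approx 0.148,
  \qquad
  \left(\frac{26 - 2\cdot 10}{26}\right)^{2} = \left(\frac{6}{26}\right)^{2} \approx 0.053 .
\]
```

So with eight plugs roughly one letter in seven decrypts correctly at the right rotor position, while with ten plugs it is only about one in nineteen, which is why the attack relies on a full set of plugs not being used.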

Gillogly [2] suggests using the index of coincidence. To obtain this statistic we count the frequency
$f_i$ of each letter in the resulting plaintext of length n and compute

$IC = \sum_{i=A}^{Z} \frac{f_i \cdot (f_i - 1)}{n \cdot (n - 1)}$.

For this approach we set the rings to position A, A, A and then run through all possible rotor orders
and  rotor  starting  positions.  For  each  setting  we  compute  the  resulting  plaintext  and  the  associated 
value  of  IC.  We  keep  those  settings  which  have  a  high  value  of  IC. 
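A minimal sketch of the statistic itself follows. It assumes the text is over the letters A to Z (anything else is ignored); the function name is hypothetical.

```python
from collections import Counter

def index_of_coincidence(text: str) -> float:
    """IC = sum_i f_i (f_i - 1) / (n (n - 1)) over the letters A..Z."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    n = len(letters)
    if n < 2:
        raise ValueError("need at least two letters")
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))
```

For uniformly random letters the value is close to 1/26, about 0.0385, so a candidate setting is interesting when its trial decryption scores noticeably above that.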

Gillogly then suggests, for the settings which give a high value of IC, running through the
associated ring settings, adjusting the starting positions as necessary, with a similar test. The
problem  with  this  approach  is  that  it  is  susceptible  to  the  effect  of  turnover  of  the  various  rotors. 
Either  a  rotor  could  turn  over  when  we  did  not  expect  it,  or  it  could  have  turned  over  by  error. 
This  is  similar  to  the  situation  we  obtained  in  our  example  using  the  Bombe  in  a  known  plaintext 
attack. 

Consider  the  following  ciphertext,  of  734  characters  in  length 

RSDZANDHWQJPPKOKYANQIGTAHIKPDFHSAWXDPSXXZMMAUEVYYRLWVFFTSDYQPS 
CXBLIVFDQRQDEBRAKIUVVYRVHGXUDNJTRVHKMZXPRDUEKRVYDFHXLNEMKDZEWV 
OFKAOXDFDHACTVUOFLCSXAZDORGXMBVXYSJJNCYOHAVQYUVLEYJHKKTYALQOAJ 
QWHYVVGLFQPTCDCAZXIZUOECCFYNRHLSTGJILZJZWNNBRBZJEEXAEATKGXMYJU 
GHMCJRQUODOYMJCXBRJGRWLYRPQNABSKSVNVFGFOVPJCVTJPNFVWCFUUPTAXSR 
VQDATYTTHVAWTQJPXLGBSIDWQNVHXCHEAMVWXKIUSLPXYSJDUQANWCBMZFSXWH 
JGNWKIOKLOMNYDARREPGEZKCTZNPQKOMJZSQHYEADZTLUPGBAVCVNJHXQKYILX 
LTHZXJKYFQEBDBQOHMXBTVXSRGMPVOGMVTEYOCQEOZUSLZDQZBCXXUXBZMZSWX 
OCIWRVGLOEZWVVOQJXSFYKDQDXJZYNPGLWEEVZDOAKQOUOTUEBTCUTPYDHYRUS
AOYAVEBJVWGZHGLHBDHHRIVIAUUBHLSHNNNAZWYCCOFXNWXDLJMEFZRACAGBTG 
NDIHOWFUOUHPJAHYZUGVJEYOBGZIOUNLPLNNZHFZDJCYLBKGQEWTQMXJKNYXPC 
KAPJGAGKWUCLGTFKYFASCYGTXGZXXACCNRHSXTPYLSJWIEMSABFH 

We run through all $60 \cdot 26^3$ possible values for the rotors and the rotor positions, with ring
settings equal to A, A, A. We obtain the “high” values for the IC statistic given in Table 8.3. For
the top 300 or so such high values we then run through all possible values for the rings $r_1$ and $r_2$
(note the third ring plays no part in the process) and we set the rotor starting positions to be

$p_1' = p_1 + r_1 + i_1$,

$p_2' = p_2 + r_2 + i_2$,

$p_3' = p_3$.

The addition of the $r_j$ value is to take into account the change in ring position from A to $r_j$. The
additional value of $i_j$ is taken from the set $\{-1, 0, 1\}$ and is used to accommodate issues to do with
rotor turnovers which our crude IC statistic is unable to pick up.
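The adjustment of the starting positions can be sketched as below, with letters read as A = 0, ..., Z = 25 and all arithmetic modulo 26. The function names are hypothetical, and the example values are one reading of the tables in this section (start V, D, C at rings A, A, A moved to rings Q, B, A).

```python
A = ord("A")

def shift(p: str, r: str, i: int) -> str:
    """Return the letter p + r + i (mod 26)."""
    return chr((ord(p) - A + ord(r) - A + i) % 26 + A)

def adjusted_positions(p: str, r: str, i: tuple[int, int] = (0, 0)) -> str:
    """Adjust the start positions p for ring settings r and turnover corrections i."""
    p1, p2, p3 = p
    r1, r2, _ = r          # the third ring plays no part
    return shift(p1, r1, i[0]) + shift(p2, r2, i[1]) + p3

print(adjusted_positions("VDC", "QBA", (0, -1)))
```

In a search one would loop over all ring letters $r_1$, $r_2$ and over $i_1, i_2 \in \{-1, 0, 1\}$, scoring each adjusted setting with the IC statistic.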

Running  through  all  these  possibilities  we  present  the  configurations  producing  the  highest 
values  of  IC  in  Table  8.4.  Finally,  using  our  previous  technique  for  finding  the  plugboard  settings 
given  the  rotor  settings  in  a  ciphertext  only  attack  (using  the  Sinkov  statistic),  we  determine  that 
the  actual  settings  are 


    ρ1   ρ2   ρ3      p1   p2   p3      r1   r2   r3
    I    II   III     L    D    C       Q    B    A

[2] See the references at the end of this chapter.

    IC           ρ1    ρ2    ρ3      p1   p2   p3
    0.04095      I     V     IV      P    R    G
    0.0409017    IV    I     II      N    O    R
    0.0409017    IV    V     I       M    G    Z
    0.0408496    V     IV    II      I    J    B
    0.040831     IV    I     V       X    D    A
    0.0408087    II    I     V       E    O    J
    0.040805     I     IV    III     T    Y    H
    0.0407827    V     I     II      J    H    F
    0.040779     III   IV    II      R    L    Q
    0.0407121    II    III   V       V    C    C
    0.0406824    IV    V     III     K    S    D
    0.0406675    IV    II    III     H    H    D
    0.04066      III   I     IV      P    L    G
    0.0406526    IV    V     II      E    E    O
    0.0406415    I     II    III     V    D    C
    0.0406303    I     II    IV      T    C    G
    0.0406266    V     IV    II      I    I    A
    0.0406229    II    III   IV      K    Q    I
    0.0405969    V     II    III     K    O    R
    0.0405931    I     III   V       K    B    O
    0.0405931    II    IV    I       K    B    Q

Table 8.3. High IC values for ring setting A, A, A


with the plugboard given by the following eight plugs:

A ↔ B, C ↔ D, E ↔ F, G ↔ H,

I ↔ J, K ↔ L, M ↔ N, O ↔ P.

With these settings one finds that the plaintext is again the first two paragraphs of “A Tale of Two
Cities”.


Chapter  Summary 


•  We  have  described  the  Enigma  machine  and  shown  how  poor  session  key  agreement  was 
used  to  break  into  the  German  traffic. 

•  We  have  also  seen  how  stereotypical  messages  were  successfully  used  to  attack  the  system. 

•  We have seen how the plugboard and the rotors worked independently of each other, which
led to attackers being able to break each component separately.

    IC           ρ1   ρ2   ρ3      p1   p2   p3      r1   r2   r3
    0.0447751    I    II   III     K    D    C       P    B    A
    0.0444963    I    II   III     L    D    C       Q    B    A
    0.0444406    I    II   III     J    D    C       O    B    A
    0.0443848    I    II   III     K    E    D       P    B    A
    0.0443588    I    II   III     K    I    D       P    F    A
    0.0443551    I    II   III     K    H    D       P    E    A
    0.0443476    I    II   III     K    F    D       P    C    A
    0.0442807    I    II   III     L    E    D       Q    B    A
    0.0442324    I    II   III     J    H    D       O    E    A
    0.0442064    I    II   III     K    G    D       P    D    A
    0.0441357    I    II   III     J    G    D       O    D    A
    0.0441097    I    II   III     J    E    D       O    B    A
    0.0441097    I    II   III     L    F    D       Q    C    A
    0.0441023    I    II   III     L    C    C       Q    A    A
    0.0440837    I    II   III     J    F    D       O    C    A
    0.0440763    I    II   III     J    I    D       O    F    A
    0.0440242    I    II   III     K    C    C       P    A    A
    0.0439833    I    II   III     L    G    D       Q    D    A
    0.0438904    I    II   III     L    I    D       Q    F    A
    0.0438607    I    II   III     L    H    D       Q    E    A
    ...          ...               ...               ...

Table 8.4. High IC values for different ring and rotor settings


Further  Reading 

The  paper  by  Rejewski  presents  the  work  of  the  Polish  cryptographers  very  clearly.  The  pure 
ciphertext  only  attack  is  presented  in  the  papers  by  Gillogly  and  Williams. 

J. Gillogly. Ciphertext-only cryptanalysis of Enigma. Cryptologia, 19, 405-413, 1995.

M. Rejewski. An application of the theory of permutations in breaking the Enigma cipher.
Applicationes Mathematicae, 16, 543-559, 1980.

H. Williams. Applying statistical language recognition techniques in the ciphertext-only
cryptanalysis of Enigma. Cryptologia, 24, 4-17, 2000.


CHAPTER  9 


Information-Theoretic  Security 


Chapter  Goals 


•  To  introduce  the  concept  of  perfect  secrecy. 

•  To  discuss  the  security  of  the  one-time  pad. 

•  To  introduce  the  concept  of  entropy. 

•  To  explain  the  notions  of  key  equivocation,  spurious  keys  and  unicity  distance. 


9.1.  Introduction 

Information  theory  is  one  of  the  foundations  of  computer  science.  In  this  chapter  we  will  examine 
its  relationship  to  cryptography.  But  we  shall  not  assume  any  prior  familiarity  with  information 
theory. 

We first need to overview the difference between information-theoretic security and computational
security. Informally, a cryptographic system is called computationally secure if the best
possible algorithm for breaking it requires N operations, where N is such a large number that it is
infeasible to carry out this many operations. With current computing power we assume that $2^{128}$
operations is an infeasible number of operations to carry out. Hence, a value of N larger than $2^{128}$
would imply that the system is computationally secure. Note that no actual system can be proved
secure under this definition, since we never know whether there is a better algorithm than the one
known. Hence, in practice we say a system is computationally secure if the best known algorithm
for breaking it requires an unreasonably large amount of computational resources.

Another  practical  approach,  related  to  computational  security,  is  to  reduce  breaking  the  system 
to  solving  some  well-studied  hard  problem.  For  example,  we  can  try  to  show  that  a  given  system 
is  secure  if  a  given  integer  N  cannot  be  factored.  Systems  of  this  form  are  often  called  provably 
secure.  However,  we  only  have  a  proof  relative  to  some  hard  problem,  and  hence  this  does  not 
provide  a  complete  guarantee  of  security. 

Essentially,  a  computationally  secure  scheme,  or  one  which  is  provably  secure,  is  only  secure 
when  we  consider  an  adversary  whose  computational  resources  are  bounded.  Even  if  the  adversary 
has  large,  but  limited,  resources,  she  still  will  not  break  the  system.  When  considering  schemes 
which  are  computationally  secure  we  need  to  be  very  clear  about  certain  issues: 

•  We  need  to  be  careful  about  the  key  sizes  etc.  If  the  key  size  is  small  then  our  adversary 
may  have  enough  computational  resources  to  break  the  system. 

•  We need to keep abreast of current algorithmic developments and developments in
computer hardware.

•  At  some  point  in  the  future  we  should  expect  our  system  to  become  broken,  either  through 
an  improvement  in  computing  power  or  an  algorithmic  breakthrough. 

It turns out that most schemes in use today are computationally secure, and so every chapter in
this book (except this one) will mainly focus on computationally secure systems.

On the other hand, a system is said to be unconditionally secure when we place no limit on the
computational power of the adversary. In other words a system is unconditionally secure if it cannot
be broken even with infinite computing power. Hence, no matter what algorithmic improvements
are made or what improvements in computing technology occur, an unconditionally secure scheme
will never be broken. Other names for unconditional security you find in the literature are perfect
security or information-theoretic security.

You  have  already  seen  that  the  following  systems  are  not  computationally  secure,  since  we 
already  know  how  to  break  them  with  very  limited  computing  resources: 

•  Shift  cipher, 

•  Substitution  cipher, 

•  Vigenère cipher,

•  Enigma  machine. 

Of the systems we shall meet later, the following are computationally secure but are not
unconditionally secure:

•  DES  and  AES, 

•  RSA, 

•  ElGamal  encryption. 

However,  the  one-time  pad  which  we  shall  meet  in  this  chapter  is  unconditionally  secure,  but  only 
if  it  is  used  correctly. 


9.2.  Probability  and  Ciphers 


Before  we  can  formally  introduce  the  concept  of  unconditional  security  we  first  need  to  understand 
in  more  detail  the  role  of  probability  in  simple  symmetric  ciphers  such  as  those  discussed  in  Chapter 
7.  We  utilize  the  following  notation  for  various  spaces: 

•  Let  P  denote  the  set  of  possible  plaintexts. 

•  Let  K  denote  the  set  of  possible  keys. 

•  Let  C  denote  the  set  of  ciphertexts. 


To each of these sets we can assign a probability distribution, where we denote the probabilities
by p(P = m), p(K = k), p(C = c). We denote the encryption function by $e_k$, and the decryption
function by $d_k$. For example, if our message space is P = {a, b, c, d} and the message a occurs with
probability 1/4 then we write

$p(P = a) = \frac{1}{4}$.

We make the reasonable assumption that P and K are independent, i.e. the user will not decide to
encrypt certain messages under one key and other messages under another. The set of ciphertexts
under a specific key k is defined by

$C(k) = \{ e_k(x) : x \in P \}$.

We then have that p(C = c) is defined by

(9)    $p(C = c) = \sum_{k : c \in C(k)} p(K = k) \cdot p(P = d_k(c))$.

As  an  example,  which  we  shall  use  throughout  this  section,  assume  that  we  have  only  four  messages 
P = {a, b, c, d} which occur with probability

•  p(P = a) = 1/4,

•  p(P = b) = 3/10,

•  p(P = c) = 3/20,

•  p(P = d) = 3/10.

Also suppose we have three possible keys given by K = {$k_1$, $k_2$, $k_3$}, which occur with probability

•  p(K = $k_1$) = 1/4,

•  p(K = $k_2$) = 1/2,

•  p(K = $k_3$) = 1/4.

Now, suppose we have C = {1, 2, 3, 4}, with the encryption function given by the following table.

            a   b   c   d
    $k_1$   3   4   2   1
    $k_2$   3   1   4   2
    $k_3$   4   3   1   2

We  can  then  compute,  using  formula  (9) 


$$\begin{aligned}
p(C = 1) &= p(K = k_1) \cdot p(P = d) + p(K = k_2) \cdot p(P = b) + p(K = k_3) \cdot p(P = c) = 0.2625, \\
p(C = 2) &= p(K = k_1) \cdot p(P = c) + p(K = k_2) \cdot p(P = d) + p(K = k_3) \cdot p(P = d) = 0.2625, \\
p(C = 3) &= p(K = k_1) \cdot p(P = a) + p(K = k_2) \cdot p(P = a) + p(K = k_3) \cdot p(P = b) = 0.2625, \\
p(C = 4) &= p(K = k_1) \cdot p(P = b) + p(K = k_2) \cdot p(P = c) + p(K = k_3) \cdot p(P = a) = 0.2125.
\end{aligned}$$

Hence the ciphertexts produced are distributed almost uniformly. For c ∈ C and m ∈ P we can
compute the conditional probability p(C = c | P = m). This is the probability that c is the
ciphertext given that m is the plaintext

$$p(C = c \mid P = m) = \sum_{k \,:\, m = d_k(c)} p(K = k).$$


This  sum  of  probabilities  is  the  sum  over  all  keys  k  for  which  the  decryption  function  on  input  of 
c  will  output  m.  For  our  prior  example  we  can  compute  these  probabilities  as 


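$$\begin{array}{llll}
p(C = 1 \mid P = a) = 0, & p(C = 2 \mid P = a) = 0, & p(C = 3 \mid P = a) = 0.75, & p(C = 4 \mid P = a) = 0.25, \\
p(C = 1 \mid P = b) = 0.5, & p(C = 2 \mid P = b) = 0, & p(C = 3 \mid P = b) = 0.25, & p(C = 4 \mid P = b) = 0.25, \\
p(C = 1 \mid P = c) = 0.25, & p(C = 2 \mid P = c) = 0.25, & p(C = 3 \mid P = c) = 0, & p(C = 4 \mid P = c) = 0.5, \\
p(C = 1 \mid P = d) = 0.25, & p(C = 2 \mid P = d) = 0.75, & p(C = 3 \mid P = d) = 0, & p(C = 4 \mid P = d) = 0.
\end{array}$$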


However,  when  we  try  to  break  a  cipher  we  want  the  conditional  probability  the  other  way  around, 
i.e.  we  want  to  know  the  probability  of  a  given  message  occurring  given  only  the  ciphertext.  We 
can  compute  the  probability  of  m  being  the  plaintext  given  that  c  is  the  ciphertext  via 


$$p(P = m \mid C = c) = \frac{p(P = m) \cdot p(C = c \mid P = m)}{p(C = c)}.$$

This conditional probability can be computed by anyone who knows the encryption function and
the probability distributions of K and P. Using these probabilities one may be able to deduce some
information about the plaintext once the ciphertext is known.


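For our previous example we compute

$$\begin{array}{llll}
p(P = a \mid C = 1) = 0, & p(P = b \mid C = 1) = 0.571, & p(P = c \mid C = 1) = 0.143, & p(P = d \mid C = 1) = 0.286, \\
p(P = a \mid C = 2) = 0, & p(P = b \mid C = 2) = 0, & p(P = c \mid C = 2) = 0.143, & p(P = d \mid C = 2) = 0.857, \\
p(P = a \mid C = 3) = 0.714, & p(P = b \mid C = 3) = 0.286, & p(P = c \mid C = 3) = 0, & p(P = d \mid C = 3) = 0, \\
p(P = a \mid C = 4) = 0.294, & p(P = b \mid C = 4) = 0.353, & p(P = c \mid C = 4) = 0.353, & p(P = d \mid C = 4) = 0.
\end{array}$$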


Hence 


•  If  we  see  the  ciphertext  1  then  we  know  the  message  is  not  equal  to  a.  We  also  can  guess 
that  it  is  more  likely  to  be  b  rather  than  c  or  d. 

• If we see the ciphertext 2 then we know the message is not equal to a or b, and it is quite
likely that the message is equal to d.

•  If  we  see  the  ciphertext  3  then  we  know  the  message  is  not  equal  to  c  or  d  and  there  is  a 
good  chance  that  it  is  equal  to  a. 

•  If  we  see  the  ciphertext  4  then  we  know  the  message  is  not  equal  to  d,  but  cannot  really 
guess  with  confidence  whether  the  message  is  a,  b  or  c. 

So  in  our  previous  example  the  ciphertext  does  reveal  a  lot  of  information  about  the  plaintext.  But 
this  is  exactly  what  we  wish  to  avoid:  We  want  the  ciphertext  to  give  no  information  about  the 
plaintext.  A  system  with  this  property,  that  the  ciphertext  reveals  nothing  about  the  plaintext,  is 
said  to  be  perfectly  secure. 
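
The calculations for the running example can be reproduced mechanically. The following Python sketch uses exact fractions; the dictionary names (p_P, p_K, enc) and the key labels "k1", "k2", "k3" are our own choices for illustration, and formula (9) is evaluated as a sum over all (key, message) pairs that encrypt to c, which is equivalent because each key gives an injective map.

```python
from fractions import Fraction as F

# Distributions and encryption table from the running example.
p_P = {"a": F(1, 4), "b": F(3, 10), "c": F(3, 20), "d": F(3, 10)}
p_K = {"k1": F(1, 4), "k2": F(1, 2), "k3": F(1, 4)}
enc = {
    "k1": {"a": 3, "b": 4, "c": 2, "d": 1},
    "k2": {"a": 3, "b": 1, "c": 4, "d": 2},
    "k3": {"a": 4, "b": 3, "c": 1, "d": 2},
}
ciphertexts = [1, 2, 3, 4]

def p_c_given_m(c, m):
    """p(C = c | P = m): total probability of the keys that map m to c."""
    return sum(p_K[k] for k in p_K if enc[k][m] == c)

def p_c(c):
    """p(C = c): sum over all (key, message) pairs that encrypt to c."""
    return sum(p_K[k] * p_P[m] for k in p_K for m in p_P if enc[k][m] == c)

def p_m_given_c(m, c):
    """p(P = m | C = c) via Bayes' theorem."""
    return p_P[m] * p_c_given_m(c, m) / p_c(c)

for c in ciphertexts:
    row = ", ".join(f"p({m}|{c}) = {float(p_m_given_c(m, c)):.3f}" for m in p_P)
    print(f"p(C = {c}) = {float(p_c(c)):.4f};  {row}")
```

Any posterior p(P = m | C = c) that differs from the prior p(P = m) is a leak; in this example every ciphertext changes the distribution of the plaintext.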

Definition  9.1  (Perfect  Secrecy).  A  cryptosystem  has  perfect  secrecy  if 

p(P  =  m  |  C  =  c)  =  p(P  =  m) 
for  all  plaintexts  m  and  all  ciphertexts  c. 

This  means  the  probability  that  the  plaintext  is  m,  given  that  we  know  the  ciphertext  is  c,  is 
the  same  as  the  probability  that  it  is  m  without  seeing  c.  In  other  words  knowing  c  reveals  no 
information  about  m.  Another  way  of  describing  perfect  secrecy  is  the  following. 

Lemma 9.2. A cryptosystem has perfect secrecy if p(C = c | P = m) = p(C = c) for all m and c.
Proof. This follows trivially from the definition

$$p(P = m \mid C = c) = \frac{p(P = m) \cdot p(C = c \mid P = m)}{p(C = c)}$$

and the fact that perfect secrecy means p(P = m | C = c) = p(P = m). □

The first result about perfect security is as follows.

Lemma 9.3. Assume the cryptosystem is perfectly secure, then

$$\#K \ge \#C \ge \#P,$$

where

• #K denotes the size of the set of possible keys,

• #C denotes the size of the set of possible ciphertexts,

• #P denotes the size of the set of possible plaintexts.

Proof. First note that in any encryption scheme, we must have

$$\#C \ge \#P$$

since encryption must be an injective map; we have to be able to decrypt after all.

We assume that every ciphertext can occur, i.e. p(C = c) > 0 for all c ∈ C, since if this does
not hold then we can alter our definition of C. Then for any message m and any ciphertext c we
have

$$p(C = c \mid P = m) = p(C = c) > 0.$$

For each m, this means that for all c there must be a key k such that

$$e_k(m) = c.$$

Hence, #K ≥ #C as required.


□ 


We  now  come  to  the  main  theorem  on  perfectly  secure  ciphers,  due  to  Shannon.  Shannon’s  Theorem 
tells  us  exactly  which  encryption  schemes  are  perfectly  secure  and  which  are  not. 

Theorem 9.4 (Shannon). Let (P, C, K, e_k(·), d_k(·)) denote a cryptosystem with #P = #C = #K.
Then the cryptosystem provides perfect secrecy if and only if

• Every key is used with equal probability 1/#K,

• For each m ∈ P and c ∈ C there is a unique key k such that e_k(m) = c.

Proof.  Note  the  statement  is  if  and  only  if;  hence  we  need  to  prove  it  in  both  directions.  We  first 
prove  the  only  if  part. 


Suppose the system gives perfect secrecy. Then we have already seen, in the proof of Lemma 9.3,
that for all m ∈ P and c ∈ C there is a key k such that e_k(m) = c. Now, since we have assumed
#C = #K we have

$$\#\{ e_k(m) : k \in K \} = \#K,$$

i.e. there do not exist two keys k_1 and k_2 such that

$$e_{k_1}(m) = e_{k_2}(m) = c.$$

So for all m ∈ P and c ∈ C there is exactly one k ∈ K such that e_k(m) = c. We need to show that
every key is used with equal probability, i.e. p(K = k) = 1/#K for all k ∈ K.

Let n = #K and P = {m_i : 1 ≤ i ≤ n}, fix c ∈ C and label the keys k_1, ..., k_n such that
e_{k_i}(m_i) = c for 1 ≤ i ≤ n. We then have, noting that due to perfect secrecy p(P = m_i | C = c) =
p(P = m_i),

$$\begin{aligned}
p(P = m_i) &= p(P = m_i \mid C = c) \\
&= \frac{p(C = c \mid P = m_i) \cdot p(P = m_i)}{p(C = c)} \\
&= \frac{p(K = k_i) \cdot p(P = m_i)}{p(C = c)}.
\end{aligned}$$

Hence we obtain, for all 1 ≤ i ≤ n, that p(C = c) = p(K = k_i). This says that the keys are used
with equal probability and hence p(K = k) = 1/#K for all k ∈ K.


Now we need to prove the result in the other direction. Namely, if

• #K = #C = #P,

• Every key is used with equal probability 1/#K,

• For each m ∈ P and c ∈ C there is a unique key k such that e_k(m) = c,

then we need to show the system is perfectly secure, i.e. for all m and c that

$$p(P = m \mid C = c) = p(P = m).$$

Since each key is used with equal probability, we have

$$\begin{aligned}
p(C = c) &= \sum_{k} p(K = k) \cdot p(P = d_k(c)) \\
&= \frac{1}{\#K} \sum_{k} p(P = d_k(c)).
\end{aligned}$$

Also, since for each m and c there is a unique key k with e_k(m) = c, we must have

$$\sum_{k} p(P = d_k(c)) = \sum_{m} p(P = m) = 1.$$

Hence, p(C = c) = 1/#K. In addition, if c = e_k(m) then p(C = c | P = m) = p(K = k) = 1/#K.
So using Bayes' Theorem we have

$$\begin{aligned}
p(P = m \mid C = c) &= \frac{p(P = m) \cdot p(C = c \mid P = m)}{p(C = c)} \\
&= \frac{p(P = m) \cdot \frac{1}{\#K}}{\frac{1}{\#K}} \\
&= p(P = m).
\end{aligned}$$

□
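
Shannon's Theorem can be checked numerically on a toy system. The sketch below is an illustration under our own assumptions: we take P = C = K = {0, ..., 4}, encrypt by e_k(m) = (m + k) mod 5 so that each pair (m, c) has exactly one key, and pick an arbitrary non-uniform plaintext distribution. The function name is_perfect is hypothetical.

```python
from fractions import Fraction as F

n = 5  # toy size: P = C = K = {0, ..., n-1}
p_P = [F(1, 2), F(1, 4), F(1, 8), F(1, 16), F(1, 16)]  # arbitrary, non-uniform

def enc(k, m):
    # For each (m, c) exactly one key k satisfies enc(k, m) == c.
    return (m + k) % n

def is_perfect(p_K):
    """Check p(P = m | C = c) == p(P = m) for every m and c."""
    def posterior(m, c):
        p_c = sum(p_K[k] * p_P[x]
                  for k in range(n) for x in range(n) if enc(k, x) == c)
        p_c_given_m = sum(p_K[k] for k in range(n) if enc(k, m) == c)
        return p_P[m] * p_c_given_m / p_c
    return all(posterior(m, c) == p_P[m] for m in range(n) for c in range(n))

uniform_keys = [F(1, n)] * n
skewed_keys = [F(1, 2), F(1, 8), F(1, 8), F(1, 8), F(1, 8)]

print(is_perfect(uniform_keys))  # True: both conditions of the theorem hold
print(is_perfect(skewed_keys))   # False: keys are not equally likely
```

With uniform keys the posterior equals the prior for every pair, whatever the plaintext distribution; skewing the key distribution breaks the first condition of the theorem and perfect secrecy is lost.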


We  end  this  section  by  discussing  a  couple  of  systems  which  have  perfect  secrecy. 

9.2.1.  Modified  Shift  Cipher:  Recall  that  the  shift  cipher  is  one  in  which  we  “add”  a  given 
letter  (the  key)  to  each  letter  of  the  plaintext  to  obtain  the  ciphertext.  We  now  modify  this  cipher 
by  using  a  different  key  for  each  plaintext  letter.  For  example,  to  encrypt  the  message  HELLO 
we choose five random keys, say FUIAT. We then add the key to the plaintext, modulo 26, to
obtain  the  ciphertext  MYTLH.  Notice  how  the  plaintext  letter  L  encrypts  to  different  letters  in  the 
ciphertext. 

When  we  use  the  shift  cipher  with  a  different  random  key  for  each  letter,  we  obtain  a  perfectly 
secure  system.  To  see  why  this  is  so,  consider  the  situation  of  encrypting  a  message  of  length  n. 
Then  the  total  number  of  keys,  ciphertexts  and  plaintexts  are  all  equal,  namely: 

$$\#K = \#C = \#P = 26^n.$$

In addition each key will occur with equal probability:

$$p(K = k) = \frac{1}{26^n},$$

and for each m and c there is a unique k such that e_k(m) = c. Hence, by Shannon's Theorem this
modified  shift  cipher  is  perfectly  secure. 
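
A minimal Python sketch of the modified shift cipher follows, assuming messages consist only of the upper-case letters A to Z. The function names are our own, and the key is drawn with Python's secrets module as a stand-in for a source of truly random key letters.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase

def shift_encrypt(plaintext, key):
    """Add key letter i to plaintext letter i, modulo 26."""
    if len(key) != len(plaintext):
        raise ValueError("key must be as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plaintext, key)
    )

def shift_decrypt(ciphertext, key):
    """Subtract key letter i from ciphertext letter i, modulo 26."""
    if len(key) != len(ciphertext):
        raise ValueError("key must be as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(ciphertext, key)
    )

def random_key(n):
    """Draw n independent, uniformly distributed key letters."""
    return "".join(secrets.choice(ALPHABET) for _ in range(n))

print(shift_encrypt("HELLO", "FUIAT"))  # MYTLH, the example from the text
key = random_key(5)
print(shift_decrypt(shift_encrypt("HELLO", key), key))  # HELLO
```

Since the key letters are independent and uniform, every one of the 26^n keys is equally likely, which is the first condition of Shannon's Theorem.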

9.2.2.  Vernam  Cipher:  The  above  modified  shift  cipher  basically  uses  addition  modulo  26.  One 
problem  with  this  is  that  in  a  computer,  or  any  electrical  device,  mod  26  arithmetic  is  hard,  but 
binary arithmetic is easy. We are particularly interested in the addition operation, which is denoted
by ⊕ and is equal to the logical exclusive-or operation.

  ⊕ | 0  1
  --+-----
  0 | 0  1
  1 | 1  0

In 1917 Gilbert Vernam patented a cipher which used these principles, called the Vernam cipher
or one-time pad. To send a binary string we need a key, which is a binary string as long as the
message. To encrypt a message we exclusive-or each bit of the plaintext with the corresponding
bit of the key to produce the ciphertext.

Each  key  is  only  allowed  to  be  used  once,  hence  the  term  one-time  pad.  This  means  that  key 
distribution  is  a  problem,  which  we  shall  come  back  to  again  and  again.  To  see  why  we  cannot  get 
away  with  using  a  key  twice,  consider  the  following  chosen  plaintext  attack.  We  assume  that  Alice 
always  uses  the  same  key  k  to  encrypt  a  message  to  Bob.  Eve  wishes  to  determine  this  key  and  so 
carries  out  the  following  attack: 

•  Eve  generates  m  and  asks  Alice  to  encrypt  it. 

• Eve obtains c = m ⊕ k.

• Eve now computes k = c ⊕ m.

You  may  object  to  this  attack  since  it  requires  Alice  to  be  particularly  stupid,  in  that  she  encrypts 
a  message  for  Eve.  But  in  designing  our  cryptosystems  we  should  try  to  make  systems  which  are 
secure  even  against  stupid  users. 

Another  problem  with  using  the  same  key  twice  is  the  following.  Suppose  Eve  can  intercept 
two  messages  encrypted  with  the  same  key 

$$c_1 = m_1 \oplus k,$$

$$c_2 = m_2 \oplus k.$$

Eve can now determine some partial information about the pair of messages m_1 and m_2 since she
can compute

$$c_1 \oplus c_2 = (m_1 \oplus k) \oplus (m_2 \oplus k) = m_1 \oplus m_2.$$

Despite  the  problems  associated  with  key  distribution,  the  one-time  pad  has  been  used  in  the  past 
in  military  and  diplomatic  contexts. 
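
Both problems with key reuse can be seen in a few lines of Python. This is only a sketch: the two messages are invented for illustration, and secrets.token_bytes stands in for a source of truly random key material.

```python
import secrets

def xor_bytes(a, b):
    """Bitwise exclusive-or of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

m1 = b"ATTACK AT DAWN"
m2 = b"RETREAT AT TEN"
k = secrets.token_bytes(len(m1))  # random key as long as the message

c1 = xor_bytes(m1, k)
c2 = xor_bytes(m2, k)             # the same key used twice: the mistake

# Chosen plaintext attack: Eve knows m1 and sees c1, so she learns k.
recovered_key = xor_bytes(c1, m1)

# Two intercepted ciphertexts reveal m1 XOR m2 without any key.
leak = xor_bytes(c1, c2)

print(recovered_key == k)
print(leak == xor_bytes(m1, m2))
```

Decryption is the same operation as encryption, since xor_bytes(c1, k) returns m1.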


9.3.  Entropy 

If  every  message  we  send  requires  a  key  as  long  as  the  message,  and  we  never  encrypt  two  messages 
with  the  same  key,  then  encryption  will  not  be  very  useful  in  everyday  applications  such  as  Internet 
transactions.  This  is  because  getting  the  key  from  one  person  to  another  will  be  an  impractical 
task.  After  all,  one  cannot  encrypt  it  since  that  would  require  another  key.  This  problem  is  called 
the  key  distribution  problem. 

To simplify the key distribution problem we need to turn from perfectly secure encryption algorithms
to ones which are, we hope, computationally secure. This is the goal of modern cryptography,
where  one  aims  to  build  systems  such  that 

•  one  key  can  be  used  many  times, 

•  a  small  key  can  encrypt  a  long  message. 

Such  systems  will  not  be  unconditionally  secure,  by  Shannon’s  Theorem,  and  so  must  be  at  best 
only  computationally  secure. 

We  now  need  to  develop  the  information  theory  needed  to  deal  with  these  computationally 
secure  systems.  Again  the  main  results  are  due  to  Shannon  in  the  late  1940s.  In  particular,  we 
shall  use  Shannon’s  idea  of  using  entropy  as  a  way  of  measuring  information. 

The  word  entropy  is  another  name  for  uncertainty,  and  the  basic  tenet  of  information  theory  is 
that  uncertainty  and  information  are  essentially  the  same  thing.  This  takes  some  getting  used  to, 
but  consider  that  if  you  are  uncertain  what  something  means  then  revealing  the  meaning  gives  you 
information.  As  a  cryptographic  application,  suppose  you  want  to  determine  the  information  in  a 
ciphertext,  in  other  words  you  want  to  know  what  the  ciphertext’s  true  meaning  is.  The  entropy 
in  the  ciphertext  is  the  amount  of  uncertainty  you  have  about  the  underlying  plaintext.  If  X  is 
a random variable, the amount of entropy (in bits) associated with X is denoted by H(X). We
shall define this quantity formally in a second. First, let us look at a simple example to help clarify
ideas. 

Suppose  X  is  the  answer  to  some  question,  i.e.  Yes  or  No.  If  you  know  I  will  always  say  Yes, 
then  my  answer  gives  you  no  information.  So  the  information  contained  in  X  should  be  zero,  i.e. 
H(X)  =  0.  There  is  no  uncertainty  about  what  I  will  say,  hence  no  information  is  given  by  me 
saying  it,  hence  there  is  no  entropy.  On  the  other  hand,  if  you  have  no  idea  what  I  will  say  and  I 
reply  Yes  with  equal  probability  to  replying  No  then  I  am  revealing  one  bit  of  information.  Hence, 
we  should  have  H  (X)  =  1. 

Note  that  the  entropy  does  not  depend  on  the  length  of  the  actual  message;  in  the  above  case 
we  have  a  message  of  length  at  most  three  letters  but  the  amount  of  information  is  at  most  one 
bit.  We  can  now  define  formally  the  notion  of  entropy. 

Definition 9.5 (Entropy). Let X be a random variable which takes a finite set of values x_i, with
1 ≤ i ≤ n, and has probability distribution p_i = p(X = x_i), where we use the convention that if
p_i = 0 then p_i log2 p_i = 0. The entropy of X is defined to be

$$H(X) = -\sum_{i=1}^{n} p_i \cdot \log_2 p_i.$$

Let us return to our Yes or No question above and show that this definition of entropy coincides
with our intuition. Recall that X is the answer to some question with responses Yes or No. If you
know I will always say Yes then p_1 = 1 and p_2 = 0. We compute H(X) = -1 · log2 1 - 0 · log2 0 = 0.
Hence, my answer reveals no information to you. If you have no idea what I will say and I reply
Yes with equal probability to replying No then p_1 = p_2 = 1/2. We now compute

$$H(X) = -\frac{\log_2 \frac{1}{2}}{2} - \frac{\log_2 \frac{1}{2}}{2} = 1.$$

Hence,  my  answer  reveals  one  bit  of  information  to  you. 
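
The definition translates directly into code. The following Python sketch (the function name entropy is our own) applies the convention that terms with p_i = 0 contribute nothing, and evaluates the two Yes or No cases just discussed.

```python
from math import log2

def entropy(probs):
    """H(X) = -sum p_i * log2(p_i), skipping terms with p_i = 0."""
    return sum(-p * log2(p) for p in probs if p > 0)

always_yes = entropy([1, 0])      # no uncertainty
fair_answer = entropy([0.5, 0.5]) # one bit of uncertainty
four_way = entropy([0.25] * 4)    # uniform over four values

print(always_yes, fair_answer, four_way)
```

The uniform four-valued case gives log2 4 = 2 bits, which is the largest entropy a variable with four values can have.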


9.3.1.  Properties  of  Entropy:  A  number  of  elementary  properties  of  entropy  follow  immediately 
from  the  definition. 

• We always have H(X) ≥ 0.

• The only way to obtain H(X) = 0 is if for some i we have p_i = 1 and p_j = 0 when j ≠ i.

• If p_i = 1/n for all i then H(X) = log2 n.

Another  way  of  looking  at  entropy  is  that  it  measures  how  much  one  can  compress  information.  If 
I send a single ASCII character to signal Yes or No, for example I could simply send Y or N, I am
actually sending eight bits of data, but I am only sending one bit of information. If I wanted to I
could compress the data down to 1/8th of its original size. Hence, naively if a message of length n
can be compressed to a proportion ε of its original size then it contains ε · n bits of information in
it.

We  now  derive  an  upper  bound  for  the  entropy  of  a  random  variable,  to  go  with  our  lower 
bound of H(X) ≥ 0. To do this we will need the following special case of Jensen's inequality.

Theorem 9.6 (Jensen's Inequality). Suppose

$$\sum_{i=1}^{n} a_i = 1$$

with a_i > 0 for 1 ≤ i ≤ n. Then, for x_i > 0,

$$\sum_{i=1}^{n} a_i \cdot \log_2 x_i \le \log_2 \left( \sum_{i=1}^{n} a_i \cdot x_i \right).$$

Equality occurs if and only if x_1 = x_2 = ... = x_n.

Using this we can now prove the following theorem.

Theorem 9.7. If X is a random variable which takes n possible values then

$$0 \le H(X) \le \log_2 n.$$

The lower bound is obtained if one value occurs with probability one, and the upper bound is obtained
if all values are equally likely.


Proof. We have already discussed the facts about the lower bound so we will concentrate on the
statements about the upper bound. The hypothesis is that X is a random variable with probability
distribution p_1, ..., p_n, with p_i > 0 for all i. One can then deduce the following sequence of
inequalities:

$$\begin{aligned}
H(X) &= -\sum_{i=1}^{n} p_i \cdot \log_2 p_i \\
&= \sum_{i=1}^{n} p_i \cdot \log_2 \frac{1}{p_i} \\
&\le \log_2 \left( \sum_{i=1}^{n} p_i \cdot \frac{1}{p_i} \right) && \text{by Jensen's inequality} \\
&= \log_2 n.
\end{aligned}$$

To obtain equality, we require equality when we apply Jensen's inequality. But this will only occur
when p_i = 1/n for all i, in other words, when all values of X are equally likely. □


9.3.2.  Joint  and  Conditional  Entropy:  The  basics  of  the  theory  of  entropy  closely  match  those 
of  the  theory  of  probability.  For  example,  if  X  and  Y  are  random  variables  then  we  define  the  joint 
probability  distribution  as 

$$r_{i,j} = p(X = x_i \text{ and } Y = y_j)$$

for 1 ≤ i ≤ n and 1 ≤ j ≤ m. The joint entropy is then obviously defined as

$$H(X, Y) = -\sum_{i=1}^{n} \sum_{j=1}^{m} r_{i,j} \cdot \log_2 r_{i,j}.$$

You should think of the joint entropy H(X, Y) as the total amount of information contained in one
observation of (x, y) ∈ X × Y. We then obtain the inequality

$$H(X, Y) \le H(X) + H(Y)$$

with  equality  if  and  only  if  X  and  Y  are  independent.  We  leave  the  proof  of  this  as  an  exercise. 

Just  as  with  probability  theory,  where  one  has  the  linked  concepts  of  joint  probability  and 
conditional  probability,  so  the  concept  of  joint  entropy  is  linked  to  the  concept  of  conditional 
entropy.  This  is  important  to  understand,  since  conditional  entropy  is  the  main  tool  we  shall  use 
in  understanding  non-perfect  ciphers  in  the  rest  of  this  chapter.  Let  X  and  Y  be  two  random 
variables.  Recall  we  defined  the  conditional  probability  distribution  as 


p(X = x | Y = y) = Probability that X = x given Y = y.

The entropy of X given an observation of Y = y is then defined in the obvious way by

$$H(X \mid Y = y) = -\sum_{x} p(X = x \mid Y = y) \cdot \log_2 p(X = x \mid Y = y).$$

Given this, we define the conditional entropy of X given Y as

$$\begin{aligned}
H(X \mid Y) &= \sum_{y} p(Y = y) \cdot H(X \mid Y = y) \\
&= -\sum_{x} \sum_{y} p(Y = y) \cdot p(X = x \mid Y = y) \cdot \log_2 p(X = x \mid Y = y).
\end{aligned}$$

This is the amount of uncertainty about X that is left after revealing a value of Y. The conditional
and joint entropy are linked by the following formula

$$H(X, Y) = H(Y) + H(X \mid Y)$$

and we have the following upper bound

$$H(X \mid Y) \le H(X)$$

with  equality  if  and  only  if  X  and  Y  are  independent.  Again,  we  leave  the  proof  of  these  statements 
as  an  exercise. 
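
While the proofs are left as an exercise, the relations are easy to check numerically. The Python sketch below uses a small joint distribution that we made up for illustration (it is not taken from the text) and computes the joint entropy, the marginal entropies and the conditional entropy from their definitions.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return sum(-p * log2(p) for p in probs if p > 0)

# Hypothetical joint distribution: r[x][y] = p(X = x and Y = y).
r = [[0.25, 0.25],
     [0.40, 0.10]]

p_X = [sum(row) for row in r]
p_Y = [sum(r[x][y] for x in range(2)) for y in range(2)]

H_XY = entropy([r[x][y] for x in range(2) for y in range(2)])
H_X = entropy(p_X)
H_Y = entropy(p_Y)

# H(X | Y) = sum over y of p(Y = y) * H(X | Y = y)
H_X_given_Y = sum(
    p_Y[y] * entropy([r[x][y] / p_Y[y] for x in range(2)])
    for y in range(2)
)

print(f"H(X,Y) = {H_XY:.4f}, H(Y) + H(X|Y) = {H_Y + H_X_given_Y:.4f}")
print(f"H(X) + H(Y) = {H_X + H_Y:.4f}, H(X|Y) = {H_X_given_Y:.4f}, H(X) = {H_X:.4f}")
```

Here X and Y are not independent, so both inequalities are strict, while the chain rule H(X, Y) = H(Y) + H(X | Y) holds up to floating-point rounding.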


9.3.3. Application to Ciphers: Now turning to cryptography again, we have some trivial statements
relating the entropy of P, K and C.

• H(P | K, C) = 0:

If you know the ciphertext and the key then you know the plaintext. This must hold since
otherwise decryption will not work correctly.

• H(C | P, K) = 0:

If  you  know  the  plaintext  and  the  key  then  you  know  the  ciphertext.  This  holds  for  all 
ciphers  we  have  seen  so  far,  and  holds  for  all  the  block  ciphers  we  shall  see  in  later  chapters. 
However,  for  modern  encryption  schemes  we  do  not  have  this  last  property  when  they  are 
used  correctly,  as  many  ciphertexts  can  correspond  to  the  same  plaintext. 

In  addition  we  have  the  following  identities 

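$$\begin{aligned}
H(K, P, C) &= H(P, K) + H(C \mid P, K) && \text{as } H(X, Y) = H(Y) + H(X \mid Y) \\
&= H(P, K) && \text{as } H(C \mid P, K) = 0 \\
&= H(K) + H(P) && \text{as } K \text{ and } P \text{ are independent}
\end{aligned}$$

and

$$\begin{aligned}
H(K, P, C) &= H(K, C) + H(P \mid K, C) && \text{as } H(X, Y) = H(Y) + H(X \mid Y) \\
&= H(K, C) && \text{as } H(P \mid K, C) = 0.
\end{aligned}$$

Hence, we obtain

$$H(K, C) = H(K) + H(P).$$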

This last equality is important since it is related to the conditional entropy H(K | C), which is
called the key equivocation. The key equivocation is the amount of uncertainty left about the key
after one ciphertext is revealed. Recall that our goal is to determine the key given the ciphertext.
Putting two of our prior equalities together we find

$$H(K \mid C) = H(K, C) - H(C) = H(K) + H(P) - H(C). \tag{10}$$


In  other  words,  the  uncertainty  about  the  key  left  after  we  reveal  a  ciphertext  is  equal  to  the 
uncertainty  in  the  plaintext  and  the  key  minus  the  uncertainty  in  the  ciphertext. 

Let  us  return  to  our  baby  cryptosystem  considered  in  the  previous  section.  Recall  we  had  the 
probability  spaces 

P = {a, b, c, d}, K = {k_1, k_2, k_3} and C = {1, 2, 3, 4},

with the associated probabilities:

• p(P = a) = 0.25, p(P = b) = p(P = d) = 0.3 and p(P = c) = 0.15,

• p(K = k_1) = p(K = k_3) = 0.25 and p(K = k_2) = 0.5,

• p(C = 1) = p(C = 2) = p(C = 3) = 0.2625 and p(C = 4) = 0.2125.

We  can  then  calculate  the  relevant  entropies  as: 

$$H(P) \approx 1.9527,$$

$$H(K) \approx 1.5,$$

$$H(C) \approx 1.9944.$$

Hence

$$H(K \mid C) \approx 1.9527 + 1.5 - 1.9944 \approx 1.4583.$$

So around one and a half bits of information about the key are left to be found, on average, after
a single ciphertext is observed. This explains why the system leaks information, and shows that it
cannot be secure. After all there are only 1.5 bits of uncertainty about the key to start with; one
ciphertext leaves us with 1.4583 bits of uncertainty. Hence, 1.5 - 1.4583 ≈ 0.042 bits of information
about the key are revealed by a single ciphertext, or equivalently three percent of the key.
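
These figures can be recomputed with a short Python sketch. The variable names and key labels are our own; the distributions and the encryption table are those of the baby cryptosystem. As a cross-check, the key equivocation is computed twice: once from equation (10) and once directly as H(K, C) - H(C).

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return sum(-p * log2(p) for p in probs if p > 0)

p_P = {"a": 0.25, "b": 0.3, "c": 0.15, "d": 0.3}
p_K = {"k1": 0.25, "k2": 0.5, "k3": 0.25}
enc = {
    "k1": {"a": 3, "b": 4, "c": 2, "d": 1},
    "k2": {"a": 3, "b": 1, "c": 4, "d": 2},
    "k3": {"a": 4, "b": 3, "c": 1, "d": 2},
}

# Joint distribution of (key, ciphertext), using independence of K and P.
p_KC = {}
for k, pk in p_K.items():
    for m, pm in p_P.items():
        pair = (k, enc[k][m])
        p_KC[pair] = p_KC.get(pair, 0.0) + pk * pm

# Marginal distribution of the ciphertext.
p_C = {}
for (k, c), p in p_KC.items():
    p_C[c] = p_C.get(c, 0.0) + p

H_P = entropy(p_P.values())
H_K = entropy(p_K.values())
H_C = entropy(p_C.values())

via_equation_10 = H_K + H_P - H_C
via_joint = entropy(p_KC.values()) - H_C  # H(K | C) = H(K, C) - H(C)

print(f"H(P) = {H_P:.4f}, H(K) = {H_K:.4f}, H(C) = {H_C:.4f}")
print(f"H(K | C) = {via_equation_10:.4f}, leaked = {H_K - via_equation_10:.4f}")
```

The two routes agree because each pair (key, ciphertext) corresponds to exactly one pair (key, plaintext), which is the content of H(K, C) = H(K) + H(P).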

9.4.  Spurious  Keys  and  Unicity  Distance 

In  our  baby  example  above,  information  about  the  key  is  leaked  by  an  individual  ciphertext,  since 
knowing  the  ciphertext  rules  out  a  certain  subset  of  the  keys.  Of  the  remaining  possible  keys,  only 
one  is  correct.  The  remaining  possible,  but  incorrect,  keys  are  called  the  spurious  keys. 

Consider  the  (unmodified)  shift  cipher,  i.e.  where  the  same  key  is  used  for  each  letter.  Suppose 
the  ciphertext  is  WNAJW,  and  suppose  we  know  that  the  plaintext  is  an  English  word.  The  only 
“meaningful”  plaintexts  are  RIVER  and  ARENA,  which  correspond  to  the  two  possible  keys  F  and 
W.  One  of  these  keys  is  the  correct  one  and  one  is  spurious. 
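
The search for meaningful decryptions can be sketched in Python as follows. The small set WORDS is a hypothetical stand-in for a test of whether a string is an English word; a real attack would use a full dictionary.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_decrypt(ciphertext, key_letter):
    """Subtract the same key letter from every ciphertext letter, modulo 26."""
    k = ALPHABET.index(key_letter)
    return "".join(ALPHABET[(ALPHABET.index(c) - k) % 26] for c in ciphertext)

# Hypothetical word list, for illustration only.
WORDS = {"RIVER", "ARENA", "HELLO", "WORLD"}

# Try all 26 keys and keep those giving a "meaningful" plaintext.
candidates = {k: shift_decrypt("WNAJW", k) for k in ALPHABET}
meaningful = {k: w for k, w in candidates.items() if w in WORDS}

print(meaningful)
```

Two keys survive, F and W, so for this five-letter ciphertext there is exactly one spurious key.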

We  can  now  explain  why  it  was  easy  to  break  the  substitution  cipher  in  terms  of  a  concept 
called  the  unicity  distance  of  the  cipher.  We  shall  explain  this  relationship  in  more  detail,  but  we 
first  need  to  understand  the  underlying  plaintext  in  more  detail.  The  plaintext  in  many  computer 
communications  can  be  considered  as  a  random  bit  string.  But  often  this  is  not  so.  Sometimes 
one  is  encrypting  an  image  or  sometimes  one  is  encrypting  plain  English  text.  In  our  discussion  we 
shall  consider  the  case  when  the  underlying  plaintext  is  taken  from  English,  as  in  the  substitution 
cipher.  Such  a  language  is  called  a  natural  language  to  distinguish  it  from  the  bit  streams  used  by 
computers  to  communicate. 

We first wish to define the entropy (or information) per letter H_L of a natural language such
as English. Note that a random string of alphabetic characters would have entropy

$$\log_2 26 \approx 4.70.$$

So we have H_L ≤ 4.70. If we let P denote the random variable of letters in the English language
then we have

$$p(P = a) = 0.082, \ \ldots, \ p(P = e) = 0.127, \ \ldots, \ p(P = z) = 0.001.$$

We can then compute

$$H_L \le H(P) \approx 4.14.$$

Hence,  instead  of  4.7  bits  of  information  per  letter,  if  we  only  examine  the  letter  frequencies  we 
conclude  that  English  conveys  around  4.14  bits  of  information  per  letter. 

But  this  is  a  gross  overestimate,  since  letters  are  not  independent.  For  example  Q  is  almost 
always  followed  by  U  and  the  bigram  TH  is  likely  to  be  very  common.  One  would  suspect  that  a 
better  statistic  for  the  amount  of  entropy  per  letter  could  be  obtained  by  looking  at  the  distribution 
of bigrams. Hence, we let P^2 denote the random variable of bigrams. If we let p(P = i, P' = j)
denote the probability that the bigram "ij" appears, then
we define

$$H(P^2) = -\sum_{i,j} p(P = i, P' = j) \cdot \log_2 p(P = i, P' = j).$$

A number of people have computed values of H(P^2) and it is commonly accepted to be given by

$$H(P^2) \approx 7.12.$$

We want the entropy per letter so we compute

$$H_L \le H(P^2)/2 \approx 3.56.$$


But  again  this  is  an  overestimate,  since  we  have  not  taken  into  account  that  the  most  common 
trigram is THE. Hence, we can also look at P^3 and compute H(P^3)/3. This will also be an
overestimate, and so on. This leads us to the following definition.

Definition 9.8. The entropy of the natural language L is defined to be

$$H_L = \lim_{n \to \infty} \frac{H(P^n)}{n}.$$

The value of H_L is hard to compute exactly but we can approximate it. In fact one has, by
experiment, that for English

$$1.0 \le H_L \le 1.5.$$


So  each  letter  in  English 

• requires 5 = ⌈log2(26)⌉ bits of data to represent it,

•  only  gives  at  most  1.5  bits  of  information. 

This  shows  that  English  contains  a  high  degree  of  redundancy,  in  that  there  is  far  less  information 
conveyed  in  an  English  sentence  than  the  amount  of  data  needed  to  represent  the  sentence.  One 
can  see  this  from  the  following,  which  you  can  still  hopefully  read  (just)  even  though  I  have  deleted 
two  out  of  every  four  letters, 

On** up** a t**e t**re **s a **rl **ll** Sn** Wh**e.

The redundancy of a language is defined by

$$R_L = 1 - \frac{H_L}{\log_2 \#P}$$

and it expresses the percentage of text in the language which can be removed (in principle) without
affecting the overall meaning. If we take H_L ≈ 1.25 then the redundancy of English is

$$R_L \approx 1 - \frac{1.25}{\log_2 26} \approx 0.75.$$

So this means that we should be able to compress an English text file of around 10 MB down to
2.5 MB.


9.4.1. Redundancy and Ciphertexts: We now return to a general cipher and suppose c ∈ C^n,
i.e. c is a ciphertext consisting of n characters. We define K(c) to be the set of keys which produce
a “meaningful” decryption of c. Then, clearly #K(c) - 1 is the number of spurious keys given c.
The average number of spurious keys is defined to be s_n, where

$$\begin{aligned}
s_n &= \sum_{c \in C^n} p(C = c) \cdot (\#K(c) - 1) \\
&= \sum_{c \in C^n} p(C = c) \cdot \#K(c) - \sum_{c \in C^n} p(C = c) \\
&= \left( \sum_{c \in C^n} \#K(c) \cdot p(C = c) \right) - 1.
\end{aligned}$$

Now if n is sufficiently large and #P = #C we obtain

$$\begin{aligned}
\log_2(s_n + 1) &= \log_2 \left( \sum_{c \in C^n} \#K(c) \cdot p(C = c) \right) \\
&\ge \sum_{c \in C^n} p(C = c) \cdot \log_2 \#K(c) && \text{by Jensen's inequality} \\
&\ge \sum_{c \in C^n} p(C = c) \cdot H(K \mid c) \\
&= H(K \mid C^n) && \text{by definition} \\
&= H(K) + H(P^n) - H(C^n) && \text{equation (10)} \\
&\approx H(K) + n \cdot H_L - H(C^n) && \text{if } n \text{ is very large} \\
&= H(K) - H(C^n) + n \cdot (1 - R_L) \cdot \log_2 \#P && \text{by definition of } R_L \\
&\ge H(K) - n \cdot \log_2 \#C + n \cdot (1 - R_L) \cdot \log_2 \#P && \text{as } H(C^n) \le n \cdot \log_2 \#C \\
&= H(K) - n \cdot R_L \cdot \log_2 \#P && \text{as } \#P = \#C.
\end{aligned}$$

So, if n is sufficiently large and #P = #C then

$$s_n \ge \frac{\#K}{\#P^{\,n \cdot R_L}} - 1.$$

As  an  attacker  we  would  like  the  number  of  spurious  keys  to  become  zero,  and  it  is  clear  that  as 
we  take  longer  and  longer  ciphertexts  then  the  number  of  spurious  keys  must  go  down. 

The unicity distance n_0 of a cipher is the value of n for which the expected number of spurious
keys becomes zero. In other words this is the average amount of ciphertext needed before an
attacker can determine the key, assuming the attacker has infinite computing power. For a perfect
cipher we have n_0 = ∞, but for other ciphers the value of n_0 can be alarmingly small. We can
obtain an estimate of n_0 by setting s_n = 0 in

$$s_n \ge \frac{\#K}{\#P^{\,n \cdot R_L}} - 1$$

to obtain

$$n_0 \approx \frac{\log_2 \#K}{R_L \cdot \log_2 \#P}.$$

In the substitution cipher we have

$$\#P = 26, \qquad \#K = 26! \approx 4 \cdot 10^{26}$$

and using our value of R_L = 0.75 for English we can approximate the unicity distance as

$$n_0 \approx \frac{88.4}{0.75 \times 4.7} \approx 25.$$

So we require on average only 25 ciphertext characters before we can break the substitution cipher,
again assuming infinite computing power. In any case after 25 characters we expect a unique valid
decryption.
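
The arithmetic behind these estimates is easily reproduced. The Python sketch below uses our own function names and simply evaluates the formula for the redundancy R_L = 1 - H_L / log2 #P and the estimate n_0 ≈ log2 #K / (R_L · log2 #P).

```python
from math import factorial, log2

def redundancy(H_L, alphabet_size):
    """R_L = 1 - H_L / log2(#P)."""
    return 1 - H_L / log2(alphabet_size)

def unicity_distance(num_keys, R_L, alphabet_size):
    """n_0 is approximately log2(#K) / (R_L * log2(#P))."""
    return log2(num_keys) / (R_L * log2(alphabet_size))

# Redundancy of English with H_L = 1.25; about 0.73, taken as 0.75 in the text.
R_english = redundancy(1.25, 26)

# Substitution cipher: #K = 26!, #P = 26.
n0_substitution = unicity_distance(factorial(26), 0.75, 26)

# Shift cipher: #K = 26, #P = 26.
n0_shift = unicity_distance(26, 0.75, 26)

print(f"R_L = {R_english:.3f}")
print(f"substitution cipher: n_0 = {n0_substitution:.1f}")
print(f"shift cipher: n_0 = {n0_shift:.2f}")
```

For the shift cipher the estimate is between one and two characters. This is only an average: the five-letter ciphertext WNAJW seen earlier still admits two meaningful decryptions.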

Now assume we have a modern cipher which encrypts bit strings using keys of bit length $l$. We
have

$$\#P = 2, \qquad \#K = 2^l.$$

Again we assume $R_L = 0.75$, which is an underestimate since we now need to encode English into
a computer communications medium such as ASCII. Then the unicity distance is

$$n_0 \approx \frac{l}{0.75} = \frac{4 \cdot l}{3}.$$

Now assume instead of transmitting the plain ASCII we compress it first. If we assume a perfect
compression algorithm then the plaintext will have no redundancy and so $R_L \approx 0$. In which case
the unicity distance is

$$n_0 \approx \frac{l}{0} = \infty.$$
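The three unicity distance estimates above can be checked numerically. The following is a minimal sketch; the function name `unicity_distance` and the choice of a 128-bit key are illustrative assumptions, not part of the text.

```python
import math

def unicity_distance(num_keys_log2: float, redundancy: float, alphabet_size: int) -> float:
    """Estimate n_0 = log2(#K) / (R_L * log2(#P)); infinite if there is no redundancy."""
    if redundancy == 0:
        return math.inf
    return num_keys_log2 / (redundancy * math.log2(alphabet_size))

# Substitution cipher: #P = 26 and #K = 26!
substitution = unicity_distance(math.log2(math.factorial(26)), 0.75, 26)

# Cipher on bit strings with an l-bit key; l = 128 is an illustrative choice
modern = unicity_distance(128, 0.75, 2)

# Perfectly compressed plaintext, so R_L = 0
compressed = unicity_distance(128, 0.0, 2)

print(round(substitution), round(modern, 1), compressed)
```

For the bit-oriented cipher the estimate is simply 4l/3 bits of ciphertext, so under these assumptions the unicity distance grows only linearly in the key length.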

So  you  may  ask  whether  modern  ciphers  encrypt  plaintexts  with  no  redundancy?  The  answer  is 
no;  even  if  one  compresses  the  data,  a  modern  cipher  often  adds  some  redundancy  to  the  plaintext 
before  encryption.  The  reason  is  that  we  have  only  considered  passive  attacks,  i.e.  an  attacker 
has  been  only  allowed  to  examine  ciphertexts  and  from  these  ciphertexts  the  attacker’s  goal  is  to 
determine  the  key.  There  are  other  types  of  attack  called  active  attacks;  in  these  an  attacker  is 
allowed  to  generate  plaintexts  or  ciphertexts  of  her  choosing  and  ask  the  key  holder  to  encrypt 
or  decrypt  them,  the  two  variants  being  called  a  chosen  plaintext  attack  and  a  chosen  ciphertext 
attack  respectively.  In  public  key  systems  that  we  shall  see  later,  chosen  plaintext  attacks  cannot 
be  stopped  since  anyone  is  allowed  to  encrypt  anything. 

We would like to stop chosen ciphertext attacks for all types of cipher. The current wisdom
for encryption algorithms is to make the cipher add some redundancy to the plaintext before it
is encrypted. In this way it is hard for an attacker to produce a ciphertext which has a valid
decryption. The philosophy is that it is then hard for an attacker to mount a chosen ciphertext
attack, since it will be hard for an attacker to choose a valid ciphertext for a decryption query. We
shall discuss this more in later chapters.


Chapter  Summary 

•  A  cryptographic  system  for  which  knowing  the  ciphertext  reveals  no  more  information 
than  if  you  did  not  know  the  ciphertext  is  called  a  perfectly  secure  system. 

•  Perfectly  secure  systems  exist,  but  they  require  keys  as  long  as  the  message  and  a  different 
key  to  be  used  with  each  new  encryption.  Hence,  perfectly  secure  systems  are  not  very 
practical. 

•  Information  and  uncertainty  are  essentially  the  same  thing. 

•  The  amount  of  uncertainty  in  a  random  variable  is  measured  by  its  entropy. 

• An attacker really wants, given the ciphertext, to determine some information about the
plaintext.

• The equation $H(K \mid C) = H(K) + H(P) - H(C)$ allows us to estimate how much uncertainty
remains about the key after one observes a single ciphertext.

•  The  natural  redundancy  of  English  means  that  a  naive  cipher  does  not  need  to  produce  a 
lot  of  ciphertext  before  the  underlying  plaintext  can  be  discovered. 


Further  Reading 

Our  discussion  of  Shannon’s  theory  has  closely  followed  the  treatment  in  the  book  by  Stinson. 
Another possible source of information is the book by Welsh. A general introduction to information
theory, including its application to coding theory, is in the book by van der Lubbe.

J.C.A.  van  der  Lubbe.  Information  Theory.  Cambridge  University  Press,  1997. 

D.  Stinson.  Cryptography:  Theory  and  Practice.  Third  Edition.  CRC  Press,  2005. 

D.  Welsh.  Codes  and  Cryptography.  Oxford  University  Press,  1988. 


CHAPTER  10 


Historical  Stream  Ciphers 


Chapter  Goals 


•  To  introduce  the  general  model  of  symmetric  ciphers. 

•  To  explain  the  relation  between  stream  ciphers  and  the  Vernam  cipher. 

•  To  examine  the  working  and  breaking  of  the  Lorenz  cipher  in  detail. 

10.1.  Introduction  to  Symmetric  Ciphers 

A  symmetric  cipher  works  using  the  following  two  transformations 

$$c = e_k(m),$$

$$m = d_k(c)$$

where

• $m$ is the plaintext,

• $e$ is the encryption function,

• $d$ is the decryption function,

• $k$ is the secret key,

• $c$ is the ciphertext.

It  is  desirable  that  both  the  encryption  and  decryption  functions  be  public  knowledge  and  so  the 
secrecy  of  the  message,  given  the  ciphertext,  depends  totally  on  the  secrecy  of  the  secret  key  k. 
Although this well-established principle, called Kerckhoffs' principle, has been known since the late
1800s, some companies still ignore it and choose to deploy secret proprietary encryption schemes
which  usually  turn  out  to  be  insecure  as  soon  as  someone  leaks  the  details  of  the  algorithms.  The 
best  schemes  will  be  the  ones  which  have  been  studied  by  many  people  for  a  very  long  time  and 
which  have  been  found  to  remain  secure.  A  scheme  which  is  a  commercial  secret  cannot  be  studied 
by  anyone  outside  the  company. 

The  above  set-up  is  called  a  symmetric  key  system  since  both  parties  need  access  to  the  secret 
key.  Sometimes  symmetric  key  cryptography  is  implemented  using  two  keys,  one  for  encryption 
and  one  for  decryption.  However,  if  this  is  the  case  we  assume  that,  given  the  encryption  key,  it  is 
easy  to  compute  the  decryption  key  (and  vice  versa).  Later  we  shall  meet  public  key  cryptography 
where  only  one  key  is  kept  secret,  called  the  private  key;  the  other  key,  called  the  public  key,  is 
allowed  to  be  published  in  the  clear.  In  this  situation  it  is  assumed  to  be  computationally  infeasible 
for  someone  to  compute  the  private  key  given  the  public  key. 

Returning  to  symmetric  cryptography,  a  moment’s  thought  reveals  that  the  number  of  possible 
keys  must  be  very  large.  This  is  because  in  designing  a  cipher  we  assume  the  worst-case  scenario 
and  give  the  attacker  the  benefit  of 

•  full  knowledge  of  the  encryption/decryption  algorithm, 

• a number of plaintext/ciphertext pairs associated with the target key $k$.

If the number of possible keys is small then an attacker can break the system using an exhaustive
search. The attacker encrypts one of the given plaintexts under all possible keys and determines
which key produces the given ciphertext. Hence, the key space needs to be large enough to avoid
such an attack. It is commonly assumed that a computation taking $2^{128}$ steps will be infeasible for
a number of years to come, hence keys should be at least 128 bits long to avoid exhaustive
search.
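To make the exhaustive search concrete, here is a minimal sketch against a deliberately tiny key space. The cipher `toy_encrypt` is a hypothetical construction invented for this illustration (a 16-bit key stretched with SHA-256 and XORed onto the message); it is not a cipher from the text and is not secure.

```python
import hashlib

def toy_encrypt(key: int, message: bytes) -> bytes:
    """Hypothetical toy cipher: XOR the message (at most 32 bytes) with SHA-256 of a 16-bit key."""
    pad = hashlib.sha256(key.to_bytes(2, "big")).digest()
    return bytes(m ^ p for m, p in zip(message, pad))

def exhaustive_search(plaintext: bytes, ciphertext: bytes, key_bits: int = 16) -> list[int]:
    """Try every key and keep those that map the known plaintext to the known ciphertext."""
    return [k for k in range(2 ** key_bits) if toy_encrypt(k, plaintext) == ciphertext]

secret_key = 0xBEEF                      # the key the attacker does not know
known_plaintext = b"attack at dawn"      # one known plaintext/ciphertext pair
known_ciphertext = toy_encrypt(secret_key, known_plaintext)

candidates = exhaustive_search(known_plaintext, known_ciphertext)
print([hex(k) for k in candidates])
```

The loop performs $2^{16}$ trial encryptions, which takes a fraction of a second. Each extra key bit doubles the work, which is why a 128-bit key puts the same attack out of reach.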

The  cipher  designer  must  play  two  roles,  that  of  someone  trying  to  break  a  cipher,  as  well  as 
someone  trying  to  create  one.  These  days,  although  there  is  a  lot  of  theory  behind  the  design  of 
ciphers,  we  still  rely  on  symmetric  ciphers  which  are  just  believed  to  be  strong,  rather  than  ones 
for  which  we  know  a  reason  why  they  are  strong.  All  this  means  is  that  the  best  attempts  of 
the  most  experienced  cryptanalysts  cannot  break  them.  This  should  be  compared  with  public  key 
ciphers  and  modes  of  operations  of  block  ciphers,  where  there  is  now  a  theory  which  allows  us  to 
reason  about  how  strong  a  given  cipher  is  (given  some  explicit  computational  assumption  on  the 
underlying  primitive). 

Figure 10.1 describes a simple model for enciphering bits which, although simple, is quite
suited to practical implementations. The idea of this model is to apply a reversible operation to
the plaintext to produce the ciphertext, namely combining the plaintext with a "random stream".
The recipient can recreate the original plaintext by applying the inverse operation, in this case by
combining the ciphertext with the same random stream.

Plaintext -> Encryption -> Ciphertext -> Decryption -> Plaintext

Figure 10.1. Simple model for enciphering bits

This  is  particularly  efficient  since  we  can  use  the  simplest  operation  available  on  a  computer, 
namely exclusive-or $\oplus$. We saw in Chapter 9 that if the key is different for every message and the
key  is  as  long  as  the  message,  then  such  a  system  can  be  shown  to  be  perfectly  secure,  namely  we 
have  the  one-time  pad.  However,  the  one-time  pad  is  not  practical  in  many  situations. 

•  We  would  like  to  use  a  short  key  to  encrypt  a  long  message. 

•  We  would  like  to  reuse  keys. 

Modern  symmetric  ciphers  allow  both  of  these  possibilities,  but  thereby  forfeit  the  perfect  secrecy 
property.  Such  a  trade-off  is  worthwhile  because  using  a  one-time  pad  produces  horrendous  key 
distribution  problems.  We  shall  see  that  key  distribution  is  still  problematic  even  for  short,  reusable 
keys. 

There  are  a  number  of  ways  to  attack  a  bulk  cipher,  some  of  which  we  outline  below.  We  divide 
our  discussion  into  passive  and  active  attacks;  a  passive  attack  is  generally  easier  to  mount  than 
an  active  attack. 

• Passive Attacks: Here the adversary is only allowed to listen to encrypted messages.
Then she attempts to break the cryptosystem by either recovering the key or determining
some secret that the communicating parties did not want leaked. One common form of
passive attack is that of traffic analysis, a technique borrowed from armies in World War
I, where a sudden increase in radio traffic at a certain point on the Western Front would
signal an imminent offensive.

• Active Attacks: Here the adversary is allowed to insert, delete or replay messages between
the two communicating parties. A general requirement is that an undetected insertion
attack should require the breaking of the cipher, whilst the cipher needs to allow
detection and recovery from deletion or replay attacks.

Bulk symmetric ciphers essentially come in two variants: stream ciphers, which operate on one
data item (bit/letter) at a time, and block ciphers, which operate on data in blocks of items (e.g.
64 bits) at a time. In this chapter we look at historical stream ciphers, leaving modern stream
ciphers until Chapter 12 and modern block ciphers until Chapter 13.

10.2.  Stream  Cipher  Basics 

Figure  10.2  gives  a  simple  explanation  of  a  stream  cipher.  This  is  very  similar  to  our  previous 
simple  model,  except  the  random  bit  stream  is  now  produced  from  a  short  secret  key  using  a  public 
algorithm,  called  the  keystream  generator. 


(Figure labels: Plaintext 110010101; Secret key; Ciphertext 011011011.)

Figure 10.2. Stream ciphers


Thus we have $c_i = m_i \oplus k_i$ where

• $m_0, m_1, \ldots$ are the plaintext bits,

• $k_0, k_1, \ldots$ are the keystream bits,

• $c_0, c_1, \ldots$ are the ciphertext bits.

This means

$$m_i = c_i \oplus k_i$$

i.e.  decryption  is  the  same  operation  as  encryption. 

Stream  ciphers  such  as  that  described  above  are  simple  and  fast  to  implement.  They  allow  very 
fast  encryption  of  large  amounts  of  data,  so  they  are  suited  to  real-time  audio  and  video  signals. 
In  addition  there  is  no  error  propagation;  if  a  single  bit  of  ciphertext  gets  mangled  during  transit 
(due  to  an  attacker  or  a  poor  radio  signal)  then  only  one  bit  of  the  decrypted  plaintext  will  be 
affected.  They  are  very  similar  to  the  Vernam  cipher  mentioned  earlier,  except  now  the  keystream 
is  only  pseudo-random  as  opposed  to  truly  random.  Thus  whilst  similar  to  the  Vernam  cipher  they 
are  not  perfectly  secure. 
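The two properties just described, that decryption is the same operation as encryption and that a damaged ciphertext bit affects only one plaintext bit, can be seen in a short sketch. The keystream generator below (SHA-256 applied to the key and a counter) is an assumption made purely for illustration; it is not a generator discussed in this book and should not be read as a recommended design.

```python
import hashlib
from itertools import count

def keystream(key: bytes):
    """Hypothetical keystream generator: hash the key with a block counter and yield bits."""
    for block in count():
        digest = hashlib.sha256(key + block.to_bytes(8, "big")).digest()
        for byte in digest:
            for shift in range(8):
                yield (byte >> shift) & 1

def xor_with_keystream(bits: list[int], key: bytes) -> list[int]:
    """Compute c_i = m_i XOR k_i; applying it again with the same key decrypts."""
    return [b ^ k for b, k in zip(bits, keystream(key))]

key = b"short secret key"
plaintext = [1, 1, 0, 0, 1, 0, 1, 0, 1]
ciphertext = xor_with_keystream(plaintext, key)
recovered = xor_with_keystream(ciphertext, key)

# Flip a single ciphertext bit "in transit" and decrypt the damaged ciphertext.
damaged = ciphertext.copy()
damaged[4] ^= 1
garbled = xor_with_keystream(damaged, key)
error_positions = [i for i, (a, b) in enumerate(zip(plaintext, garbled)) if a != b]
print(recovered == plaintext, error_positions)
```

Only position 4 of the decrypted message differs from the original, since each plaintext bit depends on exactly one ciphertext bit and one keystream bit.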

Just  like  the  Vernam  cipher,  stream  ciphers  suffer  from  the  following  problem:  the  same  key 
used  twice  gives  the  same  keystream,  which  can  reveal  relationships  between  messages.  For  example 
suppose $m_1$ and $m_2$ were encrypted under the same key $k$, then an adversary could work out the
exclusive-or of the two plaintexts without knowing what the plaintexts were

$$c_1 \oplus c_2 = (m_1 \oplus k) \oplus (m_2 \oplus k) = m_1 \oplus m_2.$$

Hence, there is a need to change keys frequently either on a per message or on a per session basis.
This results in difficult key management and distribution challenges, which, as we shall see later, can
be addressed using public key cryptography. A typical strategy is to use public key cryptography
to determine session or message keys, and then to rapidly encrypt the actual data using either a
stream or block cipher.
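The key reuse problem can be illustrated with a few lines of code. In this sketch the keystream bytes and the two messages are arbitrary values chosen for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise exclusive-or of two equal-length strings."""
    return bytes(x ^ y for x, y in zip(a, b))

keystream = bytes.fromhex("8f3a1c55e07b92d4416a0b")  # arbitrary illustrative keystream
m1 = b"ATTACK DAWN"
m2 = b"RETREAT NOW"

c1 = xor_bytes(m1, keystream)
c2 = xor_bytes(m2, keystream)

# The adversary sees only c1 and c2, yet the keystream cancels out.
leak = xor_bytes(c1, c2)

# If one plaintext is known or guessed, the other follows immediately.
recovered_m2 = xor_bytes(leak, m1)
print(leak == xor_bytes(m1, m2), recovered_m2)
```

The keystream never appears in `leak`, so the strength of the keystream generator is irrelevant once the same keystream has been used twice.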

The keystream produced by the keystream generator above needs to satisfy a number of properties
for the stream cipher to be considered secure. As a bare minimum it should

• Have a long period. Since the keystream $k_i$ is produced via a deterministic process from
the key, there will exist a number $N$ such that

$$k_{i+N} = k_i$$

for all values of $i$. This number $N$ is called the period of the sequence, and should be large
for the keystream generator to be considered secure.

•  Have  pseudo-random  properties.  The  generator  should  produce  a  sequence  which  appears 
to  be  random,  in  other  words  it  should  pass  a  number  of  statistical  random  number  tests. 

•  Have  large  linear  complexity  (see  Chapter  12  for  an  explanation). 

However,  these  conditions  are  not  sufficient.  Generally,  determining  more  of  the  sequence  from  a 
part  should  be  computationally  infeasible.  Ideally,  even  if  one  knows  the  first  one  billion  bits  of 
the  keystream  sequence,  the  probability  of  guessing  the  next  bit  correctly  should  be  no  better  than 
one  half. 

In  Chapter  12  we  shall  discuss  how  modern  stream  ciphers  can  be  created  using  a  combination 
of  simple  circuits  called  Linear  Feedback  Shift  Registers.  But  first  we  will  look  at  an  earlier 
construction  using  rotor  machines,  or  in  modern  nomenclature  Shift  Registers  (i.e.  shift  registers 
with  no  linear  feedback). 


10.3.  The  Lorenz  Cipher 

The  Lorenz  cipher  was  a  German  cipher  from  World  War  II  which  was  used  for  strategic  information, 
as  opposed  to  the  tactical  and  battlefield  information  encrypted  under  the  Enigma  machine.  The 
Lorenz machine was a stream cipher which worked on streams of bits. However, it produced not a
single stream of bits, but five. The reason for this was the encoding of the teleprinter messages used
at the time, namely Baudot code.

10.3.1.  Baudot  Code:  To  understand  the  Lorenz  cipher  we  first  need  to  understand  Baudot 
code.  We  all  are  aware  of  the  ASCII  encoding  for  the  standard  letters  on  a  keyboard,  which  uses 
seven  bits  for  the  data,  plus  one  bit  for  error  detection.  Prior  to  ASCII,  indeed  as  far  back  as  1870, 
Baudot  invented  an  encoding  which  used  five  bits  of  data.  This  was  further  developed  until,  by  the 
1930s,  it  was  the  standard  method  of  communicating  via  teleprinter.  The  data  was  encoded  via  a 
tape, consisting of a sequence of five rows of holes/non-holes.

Those  of  us  of  a  certain  age  in  the  United  Kingdom  can  remember  the  football  scores  being 
sent  in  on  a  Saturday  evening  by  teleprinter,  and  those  who  are  even  older  can  maybe  recall  the 
ticker-tape  parades  in  New  York:  the  ticker-tape  was  the  remains  of  messages  in  Baudot  code  that 
had  been  transmitted  between  teleprinters.  Those  who  can  remember  early  dial-up  modems  will 
recall that the speeds were measured in Baud, or symbols per second, in memory of Baudot's
invention. 

Since  five  bits  does  not  allow  one  to  encode  all  the  characters  that  one  wants,  Baudot  code 
used  two  possible  “states”  called  letters  shift  and  figures  shift.  Movement  between  the  two  states 
was  controlled  by  control  characters;  a  number  of  other  control  characters  were  reserved  for  things 
such  as  space  (SP),  carriage  return  (CR),  line  feed  (LF)  or  a  character  which  rang  the  teleprinter’s 
bell (BELL) (such control codes still exist in ASCII)¹. The table for Baudot code in the 1930s is
presented in Table 10.1. Thus to transmit the message

Please, Please Help!

one would need to transmit the encoding, which we give in hexadecimal,

16, 12, 01, 03, 05, 01, 1B, 0C, 1F, 04, 16, 12, 01, 03, 05, 01, 04, 14, 01, 12, 16, 1B, 0D.

Table 10.1. The Baudot code (columns: bits in code from lsb to msb, hex code, letters shift, figures shift; the entries of the table are not reproduced here)

¹ A line feed moves one line down, whereas a carriage return moves the cursor to the beginning of a line. These
two teleprinter/typewriter codes still cause problems today. In Windows, text files use both codes to move down and
start a new line of text, whereas Unix systems achieve the same effect by just using a line feed control code. This
causes problems when a text file is moved from one system to another.

10.3.2. Lorenz Operation: The Lorenz cipher encrypted data in Baudot code form by producing
a  sequence  of  five  random  bits  which  was  exclusive-or’d  with  the  bits  representing  the  Baudot  code. 
The  actual  Lorenz  cipher  made  use  of  a  sequence  of  wheels,  each  having  a  number  of  pins.  The 
presence,  or  absence,  of  a  pin  signalled  a  one  or  a  zero  signal.  As  the  wheel  turned,  the  position  of 
the  pins  changed  relative  to  an  input  signal.  In  modern  parlance  each  wheel  corresponds  to  a  shift 
register. 
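Before looking at the registers in more detail, the Baudot encoding of the example message "Please, Please Help!" from Section 10.3.1 can be checked with a short sketch. The table below is only partial: it contains just the symbols of that message, with hex values read off the encoding given in the text. Treating 1B as the figures-shift character, 1F as the letters-shift character, and starting in letters shift are assumptions inferred from the positions of those codes in that encoding.

```python
# Partial Baudot table: only the symbols needed for the example message.
LETTERS = {"P": 0x16, "L": 0x12, "E": 0x01, "A": 0x03, "S": 0x05, "H": 0x14, " ": 0x04}
FIGURES = {",": 0x0C, "!": 0x0D}
LETTERS_SHIFT = 0x1F   # assumed control code for switching to letters shift
FIGURES_SHIFT = 0x1B   # assumed control code for switching to figures shift

def baudot_encode(message: str) -> list[int]:
    """Encode a message, inserting a shift character whenever the state changes."""
    codes = []
    in_figures = False  # assumption: transmission starts in letters shift
    for ch in message.upper():
        if ch in LETTERS:
            if in_figures:
                codes.append(LETTERS_SHIFT)
                in_figures = False
            codes.append(LETTERS[ch])
        elif ch in FIGURES:
            if not in_figures:
                codes.append(FIGURES_SHIFT)
                in_figures = True
            codes.append(FIGURES[ch])
        else:
            raise ValueError(f"symbol {ch!r} is not in this partial table")
    return codes

encoded = baudot_encode("Please, Please Help!")
print(", ".join(f"{c:02X}" for c in encoded))
```

Each of these 5-bit codes is what the Lorenz cipher combines with five keystream bits.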

Consider  a  register  of  length  32  bits  or,  equivalently,  a  wheel  with  circumference  32.  At  each 
clock  tick  the  register  shifts  left  by  one  bit  and  the  leftmost  bit  is  output;  equivalently  the  wheel 
turns  around  1/32  of  a  revolution  and  the  topmost  pin  is  taken  as  the  output.  This  is  represented 
in  Figure  10.3.  In  Chapter  12  we  shall  see  shift  registers,  with  more  complex  feedback  functions, 
being used in modern stream ciphers. However, it is interesting to see how similar ideas were used
such a long time ago.

Figure 10.3. Shift Register of 32 bits


In  Chapter  12  we  shall  see  that  the  main  problem  is  how  to  combine  the  more  complex  shift 
registers  into  a  secure  cipher.  The  same  problem  exists  with  the  Lorenz  cipher:  namely,  how  the 
relatively  simple  operation  of  the  wheels/shift  registers  can  be  combined  to  produce  a  cipher  which 
is  hard  to  break.  From  now  on  we  shall  refer  to  these  as  shift  registers  as  opposed  to  wheels. 
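A wheel viewed as a shift register without a feedback function can be modelled in a few lines. This is a minimal sketch; the class name `ShiftRegister` and the 4-pin example wheel are illustrative choices, not taken from the Lorenz machine.

```python
class ShiftRegister:
    """A wheel modelled as a cyclic shift register with no feedback function."""

    def __init__(self, bits: str):
        self.state = [int(b) for b in bits]

    def clock(self) -> int:
        """Output the leftmost bit, then rotate the register left by one place."""
        out = self.state[0]
        self.state = self.state[1:] + [out]
        return out

wheel = ShiftRegister("1011")  # hypothetical wheel with four pin positions
outputs = [wheel.clock() for _ in range(12)]
print(outputs)
```

The output simply repeats the pin pattern, so a single wheel of length $n$ produces a stream of period at most $n$. This is why the way the wheels are combined matters so much.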


10.3.3. The Lorenz Cipher's Wheels: A Lorenz cipher uses twelve registers to produce the
five streams of random bits. The twelve registers are divided into three subsets. The first set, called
the chi-wheels, consists of five shift registers which we denote by $\chi^{(i)}_j$, comprising the output bit of
the $i$th shift register on the $j$th clocking of the register, for $i = 1, 2, 3, 4, 5$. The five $\chi$ registers have
lengths 41, 31, 29, 26 and 23, thus

$$\chi^{(1)}_{t+41} = \chi^{(1)}_t, \quad \chi^{(2)}_{t+31} = \chi^{(2)}_t, \quad \chi^{(3)}_{t+29} = \chi^{(3)}_t, \quad \chi^{(4)}_{t+26} = \chi^{(4)}_t, \quad \chi^{(5)}_{t+23} = \chi^{(5)}_t$$

for all values of $t$. The second set of five shift registers, called the psi-wheels, we denote by $\psi^{(i)}_j$ for
$i = 1, 2, 3, 4, 5$. These $\psi$ registers have respective lengths 43, 47, 51, 53 and 59, i.e.

$$\psi^{(1)}_{t+43} = \psi^{(1)}_t, \quad \psi^{(2)}_{t+47} = \psi^{(2)}_t, \quad \psi^{(3)}_{t+51} = \psi^{(3)}_t, \quad \psi^{(4)}_{t+53} = \psi^{(4)}_t, \quad \psi^{(5)}_{t+59} = \psi^{(5)}_t.$$

The other two registers we shall denote by $\mu^{(i)}_j$ for $i = 1, 2$; these are called the motor registers.
The lengths of the $\mu$ registers are 61 and 37 respectively, and so

$$\mu^{(1)}_{t+61} = \mu^{(1)}_t, \quad \mu^{(2)}_{t+37} = \mu^{(2)}_t.$$

10.3.4. Lorenz Cipher Operation: To describe how the Lorenz cipher clocks the various registers,
we use the variable $t$ to denote a global clock, which will be ticked for every Baudot code
character which is encrypted. We also use a variable $t_\psi$ to denote how often the $\psi$ registers have
been clocked, and a variable $t_\mu$ which denotes how often the second $\mu$ register, $\mu^{(2)}$, has been
clocked. To start the cipher we set $t = t_\psi = t_\mu = 0$; at a given point we perform the following
operations:

(1) Let $\kappa$ denote the vector $\left(\chi^{(i)}_t \oplus \psi^{(i)}_{t_\psi}\right)_{i=1}^{5}$.

(2) $t = t + 1$.

(3) If $\mu^{(1)}_t = 1$ then set $t_\mu = t_\mu + 1$.

(4) If $\mu^{(2)}_{t_\mu} = 1$ then set $t_\psi = t_\psi + 1$.

(5) Output $\kappa$.

The first line of the above produces the output keystream, the third line clocks the second $\mu$ register
if the output of the first $\mu$ register is set (once it has been clocked), whilst the fourth line clocks
all of the $\psi$ registers if the output of the second $\mu$ register is set. From the above it should be
deduced that the $\chi$ registers and the first $\mu$ register are clocked at every time interval. To encrypt
a character the output vector $\kappa$ is exclusive-or'd with the Baudot code representing the character
of the plaintext. This is described graphically in Figure 10.4. Each clocking signal is depicted as a
line with a circle on the end; each output wire is depicted by an arrow.


Figure  10.4.  Graphical  representation  of  the  Lorenz  cipher 


The actual outputs of the $\psi$ and $\mu$ motors at each time step are called the extended-$\psi$ and the
extended-$\mu$ streams. To ease future notation we will let $\psi'^{(i)}_t$ denote the output of the $\psi$ registers
at time $t$, whilst $\mu'^{(2)}_t$ will denote the output of the second $\mu$ register at time $t$. In other words, for
a given tuple $(t, t_\psi, t_\mu)$ of valid clock values we have $\psi'^{(i)}_t = \psi^{(i)}_{t_\psi}$ and $\mu'^{(2)}_t = \mu^{(2)}_{t_\mu}$.


10.3.5.  Example:  To  see  this  in  operation  consider  the  following  example,  where  we  describe  the 
state  of  the  cipher  with  the  following  notation: 

Chi:  11111000101011000111100010111010001000111 

1100001101011101101011011001000 
10001001111001100011101111010 
11110001101000100011101001
11011110000001010001110

Psi:  1011000110100101001101010101010110101010100 

11010101010101011101101010101011000101010101110 
101000010011010101010100010110101110101010100101001 
01010110101010000101010011011010100110110101101011001 
01010101010101010110101001001101010010010101010010001001010 
Motor:  0101111110110101100100011101111000100100111000111110101110100 
0111011100011111111100001010111111111 


This gives the states of the $\chi$, $\psi$ and $\mu$ registers at time $t = t_\psi = t_\mu = 0$. The states will be shifted
leftwise, and the output of each register will be the leftmost bit. So executing the above algorithm
at time $t = 0$ the output first key vector will be

$$\kappa_0 = \chi_0 \oplus \psi_0 = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} \oplus \begin{pmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 1 \end{pmatrix}.$$

Since $\mu^{(1)}_1 = 1$ we clock the $t_\mu$ value, and since $\mu^{(2)}_1 = 1$ we also clock the $t_\psi$ value. Thus at time
$t = 1$ the state of the Lorenz cipher becomes

Chi:  11110001010110001111000101110100010001111 

1000011010111011010110110010001 
00010011110011000111011110101 
11100011010001000111010011 
10111100000010100011101 

Psi:  0110001101001010011010101010101101010101001 

10101010101010111011010101010110001010101011101 
010000100110101010101000101101011101010101001010011 
10101101010100001010100110110101001101101011010110010 
10101010101010101101010010011010100100101010100100010010100 
Motor:  1011111101101011001000111011110001001001110001111101011101000 
1110111000111111111000010101111111110 


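Now we look at what happens at the next clock tick. At time $t = 1$ we now output the vector

$$\kappa_1 = \chi_1 \oplus \psi_1 = \begin{pmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 1 \end{pmatrix} \oplus \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$

But now since $\mu^{(1)}_2$ is equal to zero we do not clock $t_\mu$, whilst since $\mu^{(2)}_1 = 1$ we still clock the
$t_\psi$ value. This process is then repeated, so that we obtain the following sequence for the first 60
output values of the keystream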


010010000101001011101100011011011101110001111111000000001001 

000100011101110011111010111110011000011011000111111101110111 

001010010110011011101110100001000100111100110010101101010000 

101000101101110010011011001011000110100011110001111101010111 

100011001000010001001000000101000000101000111000010011010011 


This is produced by exclusive-or'ing the output of the $\chi$ registers, which is given by:

111110001010110001111000101110100010001111111100010101100011

110000110101110110101101100100011000011010111011010110110010 

100010011110011000111011110101000100111100110001110111101010 

111100011010001000111010011111000110100010001110100111110001 

110111100000010100011101101111000000101000111011011110000001 

with the values of output of the $\psi'_t$ stream at time $t$,

101100001111111010010100110101111111111110000011010101101010 

110100101000000101010111011010000000000001111100101011000101 

101000001000000011010101010100000000000000000011011010111010 

010100110111111010100001010100000000000001111111011010100110 

010100101000000101010101101010000000000000000011001101010010 

To ease understanding we also present the output $\mu'^{(2)}_t$ which is

11110111100000111111111111111000000000001000010111111111111 

Recall that a one in this stream means that the $\psi$ registers are clocked whilst a zero implies they
are not clocked. One can see this effect in the $\psi'_t$ output given earlier.
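The example above can be reproduced with a short simulation. This is a sketch under stated assumptions: the function name `lorenz_keystream` is invented for illustration, registers are indexed modulo their length rather than physically shifted, and the clocking conditions follow the reading of steps (3) and (4) that agrees with the worked example (the first motor is read at the new value of $t$, the second motor at the current value of $t_\mu$).

```python
CHI = [
    "11111000101011000111100010111010001000111",
    "1100001101011101101011011001000",
    "10001001111001100011101111010",
    "11110001101000100011101001",
    "11011110000001010001110",
]
PSI = [
    "1011000110100101001101010101010110101010100",
    "11010101010101011101101010101011000101010101110",
    "101000010011010101010100010110101110101010100101001",
    "01010110101010000101010011011010100110110101101011001",
    "01010101010101010110101001001101010010010101010010001001010",
]
MOTOR = [
    "0101111110110101100100011101111000100100111000111110101110100",
    "0111011100011111111100001010111111111",
]

def lorenz_keystream(chi, psi, motor, n):
    """Return the first n key vectors kappa as 5-tuples of bits."""
    t = t_psi = t_mu = 0
    output = []
    for _ in range(n):
        # Step (1): kappa = chi_t XOR psi_{t_psi}, one bit per wheel pair.
        kappa = tuple(
            int(c[t % len(c)]) ^ int(p[t_psi % len(p)]) for c, p in zip(chi, psi)
        )
        # Step (2): tick the global clock.
        t += 1
        # Step (3): the first motor, which moves at every tick, drives the second motor.
        if motor[0][t % len(motor[0])] == "1":
            t_mu += 1
        # Step (4): the second motor decides whether all psi wheels move.
        if motor[1][t_mu % len(motor[1])] == "1":
            t_psi += 1
        # Step (5): output kappa.
        output.append(kappa)
    return output

keystream = lorenz_keystream(CHI, PSI, MOTOR, 60)
first_stream = "".join(str(kappa[0]) for kappa in keystream)
print(keystream[0], keystream[1])
print(first_stream)
```

The first two vectors agree with the key vectors computed by hand above, and the string `first_stream` can be compared with the first row of the keystream listed in the example.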


10.3.6. Lorenz Key Size: Just like the Enigma machine the Lorenz cipher has a long-term key
set-up and a short-term per message set-up. The long-term key is the state of each register. Thus
it appears that there are a total of

$$2^{41+31+29+26+23+43+47+51+53+59+61+37} = 2^{501}$$

states, although the actual number is slightly less than this due to a small constraint which will be
introduced in a moment. In the early stages of the war the $\mu$ registers were changed on a daily basis,
the $\chi$ registers were changed on a monthly basis and the $\psi$ registers were changed on a monthly or
quarterly basis. Thus, if the month's settings had been broken then the "day" key "only" consisted
of at most

$$2^{61+37} = 2^{98}$$

states. As the war progressed the Germans moved to changing all the internal states of the registers
every day.

Given these "day" values for the register contents, the per message setting is given by the
starting position of each register. Thus the total number of message keys, given a day key, is given
by

$$41 \cdot 31 \cdot 29 \cdot 26 \cdot 23 \cdot 43 \cdot 47 \cdot 51 \cdot 53 \cdot 59 \cdot 61 \cdot 37 \approx 2^{64}.$$
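The key size figures above are simple arithmetic on the register lengths, which the following sketch reproduces. The variable names are illustrative.

```python
import math
from itertools import combinations

CHI_LENGTHS = [41, 31, 29, 26, 23]
PSI_LENGTHS = [43, 47, 51, 53, 59]
MOTOR_LENGTHS = [61, 37]
ALL_LENGTHS = CHI_LENGTHS + PSI_LENGTHS + MOTOR_LENGTHS

# One bit per register position: the exponent of the long-term key space.
pattern_bits = sum(ALL_LENGTHS)

# Only the motor registers changed daily in the early stages of the war.
motor_bits = sum(MOTOR_LENGTHS)

# One starting position per register: the number of per-message settings.
start_positions = math.prod(ALL_LENGTHS)

# The lengths share no common factors.
pairwise_coprime = all(math.gcd(a, b) == 1 for a, b in combinations(ALL_LENGTHS, 2))

print(pattern_bits, motor_bits, round(math.log2(start_positions), 1), pairwise_coprime)
```

Because the register lengths are pairwise coprime, the joint position of twelve freely turning wheels would only repeat after the full product of the lengths, which is the same quantity as the number of starting positions.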

The Lorenz cipher has an obvious weakness as defined, which is what eventually led to its
breaking, and which the Germans were aware of. The basic technique which we will use throughout
the rest of this chapter is to take the "Delta" of a sequence; this is defined as follows, for a sequence
$s = (s_i)_{i=0}^{\infty}$:

$$\Delta s = (s_i \oplus s_{i+1})_{i=0}^{\infty}.$$

We shall denote the value of the $\Delta s$ sequence at time $t$ by $(\Delta s)_t$. The $\Delta$ operator is very important
in the analysis of the Lorenz cipher because

$$\kappa^{(i)}_t = \chi^{(i)}_t \oplus \psi^{(i)}_{t_\psi}$$

and

$$\kappa^{(i)}_{t+1} = \chi^{(i)}_{t+1} \oplus \left(\mu'^{(2)}_t \cdot \psi^{(i)}_{t_\psi+1}\right) \oplus \left((\mu'^{(2)}_t \oplus 1) \cdot \psi^{(i)}_{t_\psi}\right),$$

so that

$$(\Delta\kappa^{(i)})_t = \left(\chi^{(i)}_t \oplus \chi^{(i)}_{t+1}\right) \oplus \left(\mu'^{(2)}_t \cdot \left(\psi^{(i)}_{t_\psi} \oplus \psi^{(i)}_{t_\psi+1}\right)\right) = (\Delta\chi^{(i)})_t \oplus \left(\mu'^{(2)}_t \cdot (\Delta\psi^{(i)})_{t_\psi}\right).$$

Now if $\Pr[\mu'^{(2)}_t = 1] = \Pr[(\Delta\psi^{(i)})_{t_\psi} = 1] = 1/2$, as we would have by choosing the register states
uniformly at random, then with probability 3/4 the value of the $\Delta\kappa$ stream reveals the value of
the $\Delta\chi$ stream, which enables the adversary to recover the state of the $\chi$ registers relatively easily.
Thus the Germans imposed a restriction on the key values so that

$$\Pr[\mu'^{(2)}_t = 1] \cdot \Pr[(\Delta\psi^{(i)})_{t_\psi} = 1] \approx 1/2.$$

In what follows we shall denote these two probabilities by $\delta = \Pr[\mu'^{(2)}_t = 1]$ and $\epsilon = \Pr[(\Delta\psi^{(i)})_{t_\psi} = 1]$.
Finally, to fix notation, if we let the Baudot encoding of the message be given by the sequence $\phi$
of 5-bit vectors, and the ciphertext be given by the sequence $\gamma$, then we have

$$\gamma^{(i)}_t = \phi^{(i)}_t \oplus \kappa^{(i)}_t.$$
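Two facts used in this argument can be checked directly: the Delta operator commutes with exclusive-or, and the term $\mu'^{(2)}_t \cdot (\Delta\psi^{(i)})_{t_\psi}$ vanishes in three of the four equally likely cases when both bits are uniform and independent. The sketch below uses arbitrary short bit streams as illustrative data, and the independence of the two bits is an assumption of the sketch.

```python
from fractions import Fraction
from itertools import product

def delta(seq: list[int]) -> list[int]:
    """(Delta s)_t = s_t XOR s_{t+1}; the result is one element shorter than the input."""
    return [a ^ b for a, b in zip(seq, seq[1:])]

def xor(a: list[int], b: list[int]) -> list[int]:
    return [x ^ y for x, y in zip(a, b)]

chi = [1, 1, 0, 1, 0, 0, 1]   # arbitrary illustrative bit streams
psi = [0, 1, 1, 1, 0, 1, 0]

delta_of_xor = delta(xor(chi, psi))
xor_of_deltas = xor(delta(chi), delta(psi))

# Enumerate (mu', delta psi) over all four equally likely combinations and
# count those in which mu' * (delta psi) = 0, i.e. delta kappa equals delta chi.
cases = list(product([0, 1], repeat=2))
prob_reveals = Fraction(sum(1 for mu, dpsi in cases if mu * dpsi == 0), len(cases))

print(delta_of_xor == xor_of_deltas, prob_reveals)
```

The enumeration gives 3/4, matching the probability quoted above for register states chosen uniformly at random.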

As  the  war  progressed  more  complex  internal  operations  of  the  Lorenz  cipher  were  introduced. 
These  were  called  “limitations”  by  Bletchley,  and  they  introduced  extra  complications  into  the 
clocking of the various registers. We shall ignore these extra complications however in our discussion.

Initially  the  Allies  did  not  know  anything  about  the  Lorenz  cipher,  even  that  it  consisted  of 
twelve  wheels,  let  alone  their  period.  In  August  1941  the  Germans  made  a  serious  mistake:  they 
transmitted  almost  identical  messages,  of  roughly  4000  characters  in  length,  using  exactly  the  same 
key. From this the cryptanalyst John Tiltman managed to reconstruct the key of roughly 4000
characters that had been output by the Lorenz cipher. From this sequence of apparently random
strings of five bits another cryptographer, Bill Tutte, recovered the precise internal workings of the
Lorenz cipher. The final confirmation that the internal workings had been deduced correctly did
not come until the end of the war, when the Allies captured a Lorenz machine on entering Germany.

Having determined the structure of the Lorenz cipher the problem remained of how to break it. The
attack  method  used  was  broken  into  two  stages.  In  the  first  stage  the  wheels  needed  to  be  broken: 
this  was  an  involved  process  which  only  had  to  be  performed  once  for  each  wheel  configuration. 
Then  a  simpler  procedure  was  produced  which  recovered  the  wheel  positions  for  each  message. 

We now explain how wheel breaking occurred. The first task was to obtain with reasonable
certainty the value of the sequence

$$\Delta\chi^{(i)} \oplus \Delta\chi^{(j)}$$

for distinct values of $i$ and $j$, usually $i = 1$ and $j = 2$. There were various different
ways of performing this; below we present a gross simplification of the techniques used by the
cryptanalysts at Bletchley. Our goal is simply to show that breaking even a 60-year-old stream
cipher requires some intricate manipulation of probability estimates, and that even small deviations
from randomness in the output stream can cause a catastrophic failure in security.

To do this we first need to consider some characteristics of the plaintext. Standard natural
language contains more repeated characters than one would normally expect, compared
to the case when a message is just random gibberish. If messages were random then one
would expect

$$\Pr[(\Delta\phi^{(i)})_t \oplus (\Delta\phi^{(j)})_t = 0] = 1/2$$

for any $i, j \in \{1, 2, 3, 4, 5\}$. However, if the plaintext sequence contains slightly more repeated
characters than we expect, this probability would be slightly more than $1/2$, so we set

$$\Pr[(\Delta\phi^{(i)})_t \oplus (\Delta\phi^{(j)})_t = 0] = 1/2 + \rho. \tag{11}$$

Due to the nature of German military parlance, and the Baudot encoding method, this was apparently
particularly pronounced when one considered the first and second streams of bits, i.e. $i = 1$
and $j = 2$.

There are essentially two situations for wheel breaking. The first (more complex) case is when
we do not know the underlying plaintext for a message, i.e. the attacker only has access to the
ciphertext. The second case is when the attacker can guess with reasonable certainty the value
of the underlying plaintext (a “crib” in the Bletchley jargon), and so can obtain the resulting
keystream.

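10.4.1. Ciphertext Only Method: The basic idea is that the sequence of ciphertext Deltas,

$$\Delta\gamma^{(i)} \oplus \Delta\gamma^{(j)},$$

will “reveal” the true value of the sequence

$$\Delta\chi^{(i)} \oplus \Delta\chi^{(j)}.$$

Consider the probability that we have

$$(\Delta\gamma^{(i)})_t \oplus (\Delta\gamma^{(j)})_t = (\Delta\chi^{(i)})_t \oplus (\Delta\chi^{(j)})_t. \tag{12}$$

Because of the relationship

$$\begin{aligned}
(\Delta\gamma^{(i)})_t \oplus (\Delta\gamma^{(j)})_t
  &= (\Delta\phi^{(i)})_t \oplus (\Delta\phi^{(j)})_t \oplus (\Delta\kappa^{(i)})_t \oplus (\Delta\kappa^{(j)})_t \\
  &= (\Delta\phi^{(i)})_t \oplus (\Delta\phi^{(j)})_t \oplus (\Delta\chi^{(i)})_t \oplus (\Delta\chi^{(j)})_t \\
  &\qquad \oplus \left( {\mu'}^{(2)}_t \cdot \left( (\Delta\psi^{(i)})_t \oplus (\Delta\psi^{(j)})_t \right) \right),
\end{aligned}$$

equation (12) can hold in one of two ways:

•  Either we have

$$(\Delta\phi^{(i)})_t \oplus (\Delta\phi^{(j)})_t = 0$$

and

$${\mu'}^{(2)}_t \cdot \left( (\Delta\psi^{(i)})_t \oplus (\Delta\psi^{(j)})_t \right) = 0.$$

The first of these events occurs with probability $1/2 + \rho$ by equation (11), whilst the second
occurs with probability

$$(1 - \delta) + \delta \cdot (\epsilon^2 + (1 - \epsilon)^2) = 1 - 2 \cdot \epsilon \cdot \delta + 2 \cdot \epsilon^2 \cdot \delta.$$

•  Or we have

$$(\Delta\phi^{(i)})_t \oplus (\Delta\phi^{(j)})_t = 1$$

and

$${\mu'}^{(2)}_t \cdot \left( (\Delta\psi^{(i)})_t \oplus (\Delta\psi^{(j)})_t \right) = 1.$$

The first of these events occurs with probability $1/2 - \rho$ by equation (11), whilst the second
occurs with probability

$$2 \cdot \delta \cdot \epsilon \cdot (1 - \epsilon).$$

Combining these probabilities together we find that equation (12) holds with probability

$$\begin{aligned}
&(1/2 + \rho) \cdot (1 - 2 \cdot \epsilon \cdot \delta + 2 \cdot \epsilon^2 \cdot \delta) + 2 \cdot (1/2 - \rho) \cdot \delta \cdot \epsilon \cdot (1 - \epsilon) \\
&\qquad \approx (1/2 + \rho) \cdot \epsilon + (1/2 - \rho) \cdot (1 - \epsilon) \\
&\qquad = 1/2 + \rho \cdot (2 \cdot \epsilon - 1),
\end{aligned}$$

since $\delta \cdot \epsilon \approx 1/2$ due to the key generation method mentioned earlier. So assuming we have a
sequence of $n$ ciphertext characters, if we are trying to determine

$$\sigma_t = (\Delta\chi^{(1)})_t \oplus (\Delta\chi^{(2)})_t,$$

i.e. we have set $i = 1$ and $j = 2$, then we know that this latter sequence has period $1271 = 41 \cdot 31$.
Thus each element in this sequence will occur $n/1271$ times. If $n$ is large enough, then taking a
majority verdict will determine the value of the sequence $\sigma_t$ with some certainty.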
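
10.4.2. Known Keystream Method: Now assume that we know the value of the keystream $\kappa_t$. We use a
similar idea to above, but now we use the sequence of keystream Deltas,

$$\Delta\kappa^{(i)} \oplus \Delta\kappa^{(j)},$$

and hope that this reveals the true value of the sequence

$$\Delta\chi^{(i)} \oplus \Delta\chi^{(j)}.$$

This is likely to happen due to the identity

$$(\Delta\kappa^{(i)})_t \oplus (\Delta\kappa^{(j)})_t = (\Delta\chi^{(i)})_t \oplus (\Delta\chi^{(j)})_t \oplus \left( {\mu'}^{(2)}_t \cdot \left( (\Delta\psi^{(i)})_t \oplus (\Delta\psi^{(j)})_t \right) \right).$$

Hence we will have

$$(\Delta\kappa^{(i)})_t \oplus (\Delta\kappa^{(j)})_t = (\Delta\chi^{(i)})_t \oplus (\Delta\chi^{(j)})_t \tag{13}$$

precisely when

$${\mu'}^{(2)}_t \cdot \left( (\Delta\psi^{(i)})_t \oplus (\Delta\psi^{(j)})_t \right) = 0.$$

This last equation will hold with probability

$$\begin{aligned}
(1 - \delta) + \delta \cdot (\epsilon^2 + (1 - \epsilon)^2) &= 1 - 2 \cdot \epsilon \cdot \delta + 2 \cdot \epsilon^2 \cdot \delta \\
&\approx 1 - 1 + \epsilon = \epsilon,
\end{aligned}$$

since $\delta \cdot \epsilon \approx 1/2$. But since $\delta \cdot \epsilon \approx 1/2$ we usually have $0.6 < \epsilon < 0.8$, thus equation (13) holds with
a reasonable probability. So as before we try to take a majority verdict to obtain an estimate for
each of the 1271 terms of the $\sigma_t$ sequence.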
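
The two probability estimates above, and the effect of the majority verdict, can be checked numerically. The following Python sketch is only an illustration: the names `rho`, `eps` and `delta` stand for the three parameters in the estimates, the numeric values chosen for them are hypothetical, and the majority calculation assumes that the votes for one term are independent, which the text does not claim.

```python
from math import comb

def prob_eq12(rho, eps, delta):
    """Probability that equation (12) holds in the ciphertext-only case."""
    both_zero = (0.5 + rho) * (1 - 2 * eps * delta + 2 * eps**2 * delta)
    both_one = (0.5 - rho) * 2 * delta * eps * (1 - eps)
    return both_zero + both_one

def prob_eq13(eps, delta):
    """Probability that equation (13) holds in the known-keystream case."""
    return (1 - delta) + delta * (eps**2 + (1 - eps)**2)

def majority_correct(p, votes):
    """Probability that the majority of an odd number of independent votes,
    each correct with probability p, gives the right bit."""
    assert votes % 2 == 1
    return sum(comb(votes, k) * p**k * (1 - p)**(votes - k)
               for k in range(votes // 2 + 1, votes + 1))

rho, eps = 0.05, 0.7        # hypothetical values, chosen only for illustration
delta = 1 / (2 * eps)       # enforce delta * eps = 1/2 exactly

p12 = prob_eq12(rho, eps, delta)   # agrees with 1/2 + rho * (2 * eps - 1)
p13 = prob_eq13(eps, delta)        # agrees with eps

print(p12, p13, majority_correct(p13, 5))
```

When `delta * eps` is exactly 1/2 the approximations in the text become equalities, which is what the sketch relies on. The small bias in the ciphertext-only case is why that attack needs far more votes per term than the known-keystream case.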

10.4.3. Both Methods Continued: Whichever of the above methods we use there will still be
some errors in our guess for the stream $\sigma_t$, which we will now try to correct. In some sense we are
not really after the values for the sequence $\sigma_t$; what we really want is the exact values of the two
shorter sequences

$$(\Delta\chi^{(1)})_t \quad \text{and} \quad (\Delta\chi^{(2)})_t,$$

since these will allow us to deduce possible values for the first two $\chi$ registers in the Lorenz cipher.
The approximation for the $\sigma_t$ sequence of 1271 bits we now write down in a $41 \times 31$-bit array,
placing the bit for time $t$ in row $t \bmod 41$ and column $t \bmod 31$, so that every entry in a given row
shares the same value of $(\Delta\chi^{(1)})_t$ and every entry in a given column shares the same value of
$(\Delta\chi^{(2)})_t$. A blank is placed in the array
if we cannot determine the value of this bit with any reasonable certainty. For example, assuming
the above configuration was used to encrypt the ciphertext, we could obtain an array which looks
something like this:

0-0—01—1—110—110-10—1-0-1 

010-0—10011-1 - 1-010-1—1 

-00-1—111001—1-1—1101—1001 

0-00—01 - 0 - 0-10-0—001 

-0-110-000-110-1000 - 1—10 

-1—0—1-100110111-01—01—1 

01-0-10111-001101 - 1-1-0— 

-01—0-0—0-100-00001-0—0 - 

1—1—10000 - 0 - 1 — 11- 

101-1-00-110—0-00—0-0—100 

1—11—000—0—0-1—1 - 1 

-1-0 - 011—1 - 1-01-01-0 

-1—1—1—00110-1-1-110-0-100- 

1 - 10-10-1—0100-010—1- 

0-0—011—0011—11-011 - 0- 

-0001-1-11—11011—1-010110— 
- 10 - 1000-001-10 - 

0-00-1-11110 - 1111-1-01-11-0- 

- 01-1 - 11-111011-1—10— 

0-0-01—1—011—11—1-1—1001 

10-1—0-1-1-0010—01-0—10—10 

- 1-01-0-1— 0-10— 1-001 

010-01-1-110011-11—1-0—1—1 

1-1-1-1000-11-010—01-0101-01-0 

1—10-00—110—0-0—011-00-10 

10-1-010-0 - 01 - 1-10-0 - 

-1—0-011-1001-0 - 01101011—1 

010-010-11-0—101—10—0—1—0- 

-01—1-0—1-01000-1—0-001-0 

-1-10—0—110-100001-0—10-110 

- 1-00001-00—00010010 - 10 

011-0101-110-1-011-101101 - 01 

01—1 - 100-01—01-01—0-01 

-011—1-000-1—10-00-0—1001-0 

1011-01—0—001—01-01—0 - 

—0—0-1-1—11 - 1 - 1-11 - 

- 0 - 1— 0 - 0 - 11-0—00- 

10—101—0 - 0-0-0—0—010 - 

-100—1-110 - 11-01-0—1— 

0100 - 1-1 - 01-1—1111—11— 

0-0001-1-1-0011011-1— 01011-0- 


Now the goal of the attacker is to fill in this array, a process known at Bletchley as “rectangling”,
bearing  in  mind  that  some  of  the  zeros  and  ones  entered  could  themselves  be  incorrect.  The  point 
to  note  is  that,  when  completed,  there  should  be  just  two  distinct  rows  in  the  table,  each  the 
complement  of  the  other,  and  similarly  for  the  columns.  A  reasonable  method  to  follow  is  to  take 
all  rows  starting  with  zero  and  then  count  the  number  of  zeros  and  ones  in  the  second  element  of 
those  rows.  In  our  example  above,  we  find  there  are  seven  ones  and  no  zeros.  Doing  the  same  for 
rows  starting  with  a  one  we  find  there  are  four  zeros  and  no  ones.  Thus  we  can  deduce  that  the 
two  types  of  rows  in  the  table  should  start  with  a  10  and  a  01.  We  then  fill  in  the  second  element 
in  any  row  which  has  its  first  element  set.  We  continue  in  this  way,  first  looking  at  rows  and  then 
looking  at  columns,  until  the  whole  table  is  filled  in. 
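
The row-and-column procedure just described can be sketched in Python. This is a simplified illustration, not the Bletchley method: it assumes the array is indexed by (t mod 41, t mod 31), which is the arrangement under which a completed table has only two distinct rows; it builds its test array from the two Delta sequences quoted later in this section; and the blanking pattern, the three flipped entries, the `rounds` parameter and the convention that the first row bit is 0 are all hypothetical choices.

```python
# Delta-chi sequences for the first two wheels, as quoted in the text.
DCHI1 = [int(ch) for ch in "00001001111101001000100111001110011001000"]
DCHI2 = [int(ch) for ch in "0100010111100110111101101011001"]

def build_grid(sigma, is_blank):
    """Place sigma[t] at (t mod 41, t mod 31); None marks a blank entry."""
    grid = [[None] * 31 for _ in range(41)]
    for t, bit in enumerate(sigma):
        r, c = t % 41, t % 31
        if not is_blank(r, c):
            grid[r][c] = bit
    return grid

def majority(votes, fallback):
    """Majority bit of a list of 0/1 votes; keep the old value on a tie."""
    ones = sum(votes)
    zeros = len(votes) - ones
    if ones > zeros:
        return 1
    if zeros > ones:
        return 0
    return fallback

def rectangle(grid, rounds=5):
    """Alternate between columns and rows, taking a majority verdict each time."""
    rows, cols = len(grid), len(grid[0])
    row_bits = [None] * rows
    col_bits = [None] * cols
    row_bits[0] = 0  # assumed convention, resolves the complement ambiguity
    for _ in range(rounds):
        for c in range(cols):
            votes = [grid[r][c] ^ row_bits[r] for r in range(rows)
                     if grid[r][c] is not None and row_bits[r] is not None]
            col_bits[c] = majority(votes, col_bits[c])
        for r in range(1, rows):
            votes = [grid[r][c] ^ col_bits[c] for c in range(cols)
                     if grid[r][c] is not None and col_bits[c] is not None]
            row_bits[r] = majority(votes, row_bits[r])
    return row_bits, col_bits

# Synthetic example: the true sigma sequence with some entries blanked out
# and three entries deliberately wrong.
sigma = [DCHI1[t % 41] ^ DCHI2[t % 31] for t in range(41 * 31)]
grid = build_grid(sigma, lambda r, c: (3 * r + 5 * c) % 7 < 3)
for r, c in [(5, 1), (17, 9), (30, 22)]:
    grid[r][c] ^= 1

row_bits, col_bits = rectangle(grid)
```

The recovered `row_bits` and `col_bits` play the roles of the first column and first row of the completed table. Real traffic would give far noisier arrays than this synthetic one, so more care over ties and convergence would be needed.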

The  above  table  was  found  using  a  few  thousand  characters  of  known  keystream,  which  allows 
(via the above method) the simple reconstruction of the full table. According to the Bletchley documents,
the cryptographers at Bletchley would actually use a few hundred characters of keystream
in  a  known  keystream  attack,  and  a  few  thousand  in  an  unknown  keystream  attack.  Since  we  are 
following  rather  naive  methods  our  results  are  not  as  spectacular. 

Once completed we can take the first column as the value of the $(\Delta\chi^{(1)})_t$ sequence and the first
row as the value of the $(\Delta\chi^{(2)})_t$ sequence. We can then repeat this analysis for different pairs of
the $\chi$ registers until we determine that we have

$$\begin{aligned}
\Delta\chi^{(1)} &= 00001001111101001000100111001110011001000, \\
\Delta\chi^{(2)} &= 0100010111100110111101101011001, \\
\Delta\chi^{(3)} &= 10011010001010100100110001111, \\
\Delta\chi^{(4)} &= 00010010111001100100111010, \\
\Delta\chi^{(5)} &= 01100010000011110010011.
\end{aligned}$$

From these $\Delta\chi$ sequences we can then determine possible values for the internal state of the $\chi$
registers.


10.4.4. Breaking the Other Wheels: So having “broken” the $\chi$ wheels of the Lorenz cipher,
the task remains to determine the internal state of the other registers. In the ciphertext only
attack one now needs to recover the actual keystream, a step which is clearly not needed in the
known-keystream scenario. The trick here is to use the statistics of the underlying language again
to try to recover the actual $\kappa$ sequence. We first de-$\chi$ the ciphertext sequence $\gamma$, using the values
of the $\chi$ registers which we have just determined, to obtain

$$\begin{aligned}
\beta_t &= \gamma_t \oplus \chi_t \\
        &= \phi_t \oplus \psi'_t.
\end{aligned}$$

We then take the Delta of this $\beta$ sequence

$$(\Delta\beta)_t = (\Delta\phi)_t \oplus \left( {\mu'}^{(2)}_t \cdot (\Delta\psi)_t \right),$$

and by our previous argument we will see that many values of the $\Delta\phi$ sequence will be “exposed”
in the $\Delta\beta$ sequence. Using a priori knowledge of the $\Delta\phi$ sequence, for example that it uses Baudot
codes and that natural language has many sequences of bigrams (e.g. a space always follows a full
stop), one can eventually recover the sequence $\phi$ and hence $\kappa$. At Bletchley this last step was
usually performed by hand.

So in both scenarios we have now determined both the $\chi$ and the $\kappa$ sequences. But what we are
really after is the initial values of the registers $\psi$ and $\mu$. To determine these we de-$\chi$ the resulting
$\kappa$ sequence to obtain the $\psi'_t$ sequence. In our example this would reveal the sequence

101100001111111010010100110101111111111110000011010101101010
110100101000000101010111011010000000000001111100101011000101
101000001000000011010101010100000000000000000011011010111010
010100110111111010100001010100000000000001111111011010100110
010100101000000101010101101010000000000000000011001101010010

given earlier. From this we can then recover a guess as to the ${\mu'}^{(2)}_t$ sequence, placing a one at
time $t$ whenever the five-bit character $\psi'_t$ differs from $\psi'_{t+1}$:

11110111100000111111111111111000000000001000010111111111111...

Note that it is only a guess; it might occur that $\psi'_t = \psi'_{t+1}$ even though the $\psi$ wheels have moved,
but we shall ignore this possibility.

Once we have determined enough of the ${\mu'}^{(2)}_t$ sequence so that we have 59 ones in it, then we will
have determined the initial state of the $\psi$ registers. This is because after 59 clock ticks of the $\psi$
registers all outputs have been presented in the $\psi'$ sequence, since the largest $\psi$ register has size
59.
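
The guess for the motor sequence can be reproduced directly from the five streams quoted above. The short Python sketch below assumes the rule that the guess at time t is one exactly when the five-bit character at time t differs from the character at time t + 1; the names `PSI_PRIME` and `motor_guess` are illustrative.

```python
# The five de-chi'd streams quoted in the text, one string per bit position.
PSI_PRIME = [
    "101100001111111010010100110101111111111110000011010101101010",
    "110100101000000101010111011010000000000001111100101011000101",
    "101000001000000011010101010100000000000000000011011010111010",
    "010100110111111010100001010100000000000001111111011010100110",
    "010100101000000101010101101010000000000000000011001101010010",
]

def motor_guess(streams):
    """Return a 0/1 string with a one wherever consecutive characters differ."""
    chars = list(zip(*streams))  # chars[t] is the five-bit character at time t
    return "".join("1" if chars[t] != chars[t + 1] else "0"
                   for t in range(len(chars) - 1))

guess = motor_guess(PSI_PRIME)
print(guess)
print("ones so far:", guess.count("1"))
```

Sixty characters yield a guess of 59 bits, which is still short of the 59 ones needed, so in practice more of the stream would have to be processed.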


All that remains is to determine the state of the $\mu$ registers. To do this we notice that the
${\mu'}^{(2)}_t$ sequence will make a transition from a 0 to a 1, or a 1 to a 0, precisely when $\mu^{(1)}_t$ outputs a
one. By constructing enough of the ${\mu'}^{(2)}_t$ stream as above (say a few hundred bits) this allows us to
determine the value of the $\mu^{(1)}$ register almost exactly. Having recovered $\mu^{(1)}_t$ we can then deduce
the values which must be contained in $\mu^{(2)}_t$ from this sequence and the resulting value of ${\mu'}^{(2)}_t$.

According to various documents, in the early stages of the Lorenz cipher-breaking effort at
Bletchley, the entire “Wheel Breaking” operation was performed by hand. However, as time progressed
the part which involved determining the $\Delta\chi$ sequences above from the rectangling procedure
was eventually performed by the Colossus computer.


10.5.  Breaking  a  Lorenz  Cipher  Message 

The Colossus was the world’s first programmable digital electronic computer, and as such was the
precursor of all modern computers. The role of the Colossus was vital to the Allied war effort, and
was so secret that its very existence was not divulged until the 1980s. The Colossus computer was
originally created not to break the wheels, i.e. to determine the long-term key of the Lorenz cipher,
but to determine the per message settings, and hence to help break the individual ciphertexts.
Whilst the previous method for breaking the wheels could be used to attack any ciphertext, for
it to work efficiently requires a large ciphertext and a lot of luck. However, once the wheels are
broken, i.e. we know the bits in the various registers, breaking the next ciphertext becomes easier.

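To break a message we again use the trick of de-$\chi$’ing the ciphertext sequence $\gamma$, and then
applying the Delta method to the resulting sequence $\beta$. We assume we know the internal states of
all the registers but not their starting positions. We shall let $s_i$ denote the unknown values of the
starting positions of the five $\chi$ wheels and $s_\psi$ (resp. $s_\mu$) the global unknown starting position of
the set of $\psi$ (resp. $\mu$) wheels. We then have

$$\begin{aligned}
\beta^{(i)}_t &= \gamma^{(i)}_t \oplus \chi^{(i)}_{t+s_i} \\
              &= \phi^{(i)}_t \oplus {\psi'}^{(i)}_{t+s_\psi},
\end{aligned}$$

and then

$$(\Delta\beta^{(i)})_t = (\Delta\phi^{(i)})_t \oplus \left( {\mu'}^{(2)}_{t+s_\mu} \cdot (\Delta\psi^{(i)})_{t+s_\psi} \right).$$

We then take two of the resulting five bit streams and exclusive-or them together as before to
obtain

$$\begin{aligned}
\sigma^{(i,j)}_t &= (\Delta\beta^{(i)})_t \oplus (\Delta\beta^{(j)})_t \\
  &= (\Delta\phi^{(i)})_t \oplus (\Delta\phi^{(j)})_t \oplus \left( {\mu'}^{(2)}_{t+s_\mu} \cdot \left( (\Delta\psi^{(i)})_{t+s_\psi} \oplus (\Delta\psi^{(j)})_{t+s_\psi} \right) \right).
\end{aligned}$$

Using our prior probability estimates we can determine the following probability estimate

$$\Pr[\sigma^{(i,j)}_t = 0] \approx 1/2 + \rho \cdot (2 \cdot \epsilon - 1),$$

which is exactly the same probability we had for equation (12) to hold true. In particular we note
that $\Pr[\sigma^{(i,j)}_t = 0] > 1/2$, which forms the basis of this method of breaking into Lorenz ciphertexts.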

Let us fix $i = 1$ and $j = 2$. On assuming we know the values for the registers, all we need do is
determine their starting positions $s_1, s_2$. We simply need to go through all $1271 = 41 \cdot 31$ possible
starting positions for the first and second $\chi$ registers. For each one of these starting positions we
compute the associated $\sigma^{(1,2)}_t$ sequence and count the number of values which are zero. Since we
have $\Pr[\sigma^{(i,j)}_t = 0] > 1/2$ the correct value for the starting positions will correspond to a particularly
high value for the count of the number of zeros.

This is a simple statistical test which allows one to determine the start positions of the first
and second $\chi$ registers. Repeating this for other pairs of registers, or using similar statistical
techniques, we can recover the start position of all $\chi$ registers. These statistical techniques are
what the Colossus computer was designed to perform.
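
The counting test can be sketched in a few lines of Python. The sketch is an illustration under stated assumptions: it works directly on Delta sequences (de-chi'ing and then taking Deltas is the same as XORing with the shifted Delta-chi sequence), it uses the two Delta-chi sequences quoted earlier in the chapter, and the contribution of the plaintext and of the other wheels is replaced by a hypothetical deterministic stand-in that is zero two times out of three. The start positions 17 and 5 are arbitrary.

```python
# Delta-chi sequences for the first two wheels, as quoted earlier in the chapter.
DCHI1 = [int(ch) for ch in "00001001111101001000100111001110011001000"]
DCHI2 = [int(ch) for ch in "0100010111100110111101101011001"]

def synthetic_delta_gamma(s1, s2, n):
    """Stand-in for the XOR of the first two ciphertext Delta streams.
    The last term imitates the biased plaintext/other-wheel contribution."""
    return [DCHI1[(t + s1) % 41] ^ DCHI2[(t + s2) % 31] ^ (1 if t % 3 == 0 else 0)
            for t in range(n)]

def count_zeros(stream, g1, g2):
    """Number of zeros left after removing the guessed Delta-chi contribution."""
    return sum(1 for t, bit in enumerate(stream)
               if (bit ^ DCHI1[(t + g1) % 41] ^ DCHI2[(t + g2) % 31]) == 0)

def recover_start(stream):
    """Try all 41 * 31 = 1271 start positions and keep the highest zero count."""
    candidates = ((g1, g2) for g1 in range(41) for g2 in range(31))
    return max(candidates, key=lambda g: count_zeros(stream, g[0], g[1]))

stream = synthetic_delta_gamma(17, 5, 3 * 1271)
best = recover_start(stream)
print(best, count_zeros(stream, *best))
```

With genuinely random-looking noise the gap between the correct count and the others is statistical rather than guaranteed, so longer messages give more reliable results.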

Once the $\chi$ register positions have been determined, the determination of the start positions of
the $\psi$ and $\mu$ registers can then be performed by hand. The techniques for this are very similar to
the earlier techniques needed to break the wheels; however, once again various simplifications occur
since one is assumed to know the state of each register, but not its start position.


Chapter  Summary 


•  We have described the general model for symmetric ciphers, and for stream ciphers in
particular.

•  We have looked at the Lorenz cipher as a stream cipher, and described its inner workings
in terms of shift registers.

•  We sketched how the Lorenz cipher was eventually broken, in particular how very tiny
deviations from true randomness in the output were exploited by the Bletchley cryptographers.


Further  Reading 

The paper by Carter provides a more detailed description of the cryptanalysis performed at Bletchley
on the Lorenz cipher. The book by Gannon is a very readable account of the entire operation
related to the Lorenz cipher, from obtaining the signals through to the construction and operation
of the Colossus computer. For the “real” details you should consult the General Report on Tunny.

F.L. Carter. The Breaking of the Lorenz Cipher: An Introduction to the Theory Behind the Operational
Role of “Colossus” at BP. In Cryptography and Coding - 1997, LNCS 1355, 74-88, Springer,
1997.

P. Gannon. Colossus: Bletchley Park’s Greatest Secret. Atlantic Books, 2007.

J. Good, D. Michie and G. Timms. General Report on Tunny, With Emphasis on Statistical Methods.
Document reference HW 25/4 and HW 25/5, Public Record Office, Kew. Originally written in 1945,
declassified in 2000.


Part  3 

Modern  Cryptography  Basics 


In  this  part  we  cover  the  basic  components  of  modern  cryptographic  systems.  As  an  overview 
of  the  chapter  headings  will  show,  modern  cryptography  is  not  just  about  symmetric  encryption. 
We  have  other  symmetric  primitives  such  as  message  authentication  codes,  there  are  public  key 
primitives  such  as  public  key  encryption  and  digital  signatures,  and  there  are  keyless  primitives 
such  as  hash  functions. 

We  also  will  see  that  behind  each  of  these  primitives  is  a  notion  of  what  it  means  for  the  primitive 
to  be  secure.  This  is  the  main  distinction  between  cryptography  in  the  twenty-first  century  and  that 
which  preceded  it.  Modern  cryptography  is  as  much  about  defining  what  we  mean  by  something 
being  secure  as  it  is  about  actually  coming  up  with  something  that  achieves  that  security  goal. 


CHAPTER  11 


Defining  Security 


Chapter  Goals 

•  To  explain  the  notion  of  a  secure  pseudo-random  function  and  permutation. 

•  To  explain  the  various  notions  of  security  of  encryption  schemes,  especially  the  notion  of 
indistinguishability  of  encryptions. 

•  To  explain  the  various  attack  notions,  in  particular  adaptive  chosen  ciphertext  attacks. 

•  To  show  how  the  concept  of  non-malleability  and  adaptive  chosen  ciphertext  attacks  are 
related. 

•  To  explain  notions  related  to  the  security  of  signature  schemes,  message  authentication 
codes  and  other  cryptographic  mechanisms. 

•  The chapter also introduces some basic techniques used in “security proofs”.

11.1.  Introduction 

Modern  cryptography  is  focused  on  three  key  aspects:  definitions,  schemes  and  proofs. 

•  Definitions:  The  first  challenge  modern  cryptography  addresses  is  to  actually  arrive 
at  a  concrete  mathematical  definition  of  what  it  means  for  a  particular  cryptographic 
mechanism  to  be  secure.  Whilst  we  may  have  a  conceptual  notion  that  encryption  should 
be  secure  as  long  as  the  key  is  not  revealed,  it  is  not  straightforward  to  define  this  precisely. 
We  will  also  see  that  modern  cryptography  is  about  more  than  just  encryption. 

•  Schemes: Once we have a security definition for a specific cryptographic mechanism,
we need to design schemes which it is hoped will meet the security definition we have
previously defined. For example we might want to build an encryption scheme whose security
intuitively rests on the difficulty of factoring numbers.

•  Proofs: The natural question to ask then is whether the design meets the security definition.
This is the approach of “provable security”, or more accurately “reductionist
security”. In this approach we design a scheme based on certain building blocks; for example
a building block could be the difficulty of factoring large integers, or the security
of a specific block cipher. Then we try to show that the larger scheme meets the security
definition by showing that if it did not one could also “break” the simpler component (e.g.
factor numbers).

This chapter is devoted to defining security, which in this book will be done in the same way
as we looked at the factoring-related and discrete-logarithm-related problems in Chapters 2 and 3.
In later chapters we will look at how to construct the mechanisms described in this chapter from
various building blocks.

11.2.  Pseudo-random  Functions  and  Permutations 

The security games from Chapters 2 and 3 for the factoring-related problems and the
discrete-logarithm-related problems will form the cornerstone of what we will call public key cryptography.
For symmetric key cryptography we need to look at another basic primitive called a pseudo-random
function, or PRF for short. A PRF is a function which appears to be random to the adversary; in
other words the adversary cannot predict its output.

It might be tempting to first try to define a security game as in Figure 11.1, for a function $F$
with domain $D$ and codomain $C$. In this game we give the adversary the function $F$, and then pick
either a random pair $(x, y)$, if $b = 0$, or a pair depending on the function $F$, $(x, F(x))$, if $b = 1$. The
adversary’s goal is to guess whether she was given a real pair associated with the PRF, or
a random pair. However, this game is easy to win since the function $F$ is known to the adversary.
She can determine the bit $b$ by evaluating $F$ on $x$ and checking whether it equals $y$. If so, then
the bit is highly likely to be one, otherwise the bit is zero. But we want to be able to give the
function to the adversary, so that she can inspect the code etc. So we seem to be stuck if we give
the adversary the function.


Figure  11.1.  First  attempt  at  a  security  game  for  a  PRF 

The way we solve this conundrum is by looking at Kerckhoffs’ principle again. Recall that this
decrees that the security of the public algorithm should rest entirely on the secrecy of the key. So
instead of looking at a single function $F$ we define a family of pseudo-random functions which are
indexed by a key $k$ chosen from some set $K$. This is much like what we were trying to achieve
using the Lorenz stream cipher in Chapter 10, only where (unlike in World War II) the adversary
starts by knowing the function family but not the key, and each key defines a new function
which should appear random to the adversary.

This allows us to define the security of a pseudo-random function family $\{F_k\}_K$ according to
the game given in Figure 11.2. We assume that the adversary is given a description of the function
family, but not the specific function selected from the family. In our pictures the family is denoted
by $\{F_k\}_K$, whereas the actual function selected is denoted by $F_k$. We also assume that each
function has the same domain $D$ and codomain $C$, and that the index $k$ is chosen from a set $K$.
The adversary’s goal is to determine whether the function input/output pair she receives is from
the real PRF family, $\{F_k\}_K$, or from a random function. Note, we do not give the key $k$ to the
adversary.

However, this game gives far too much power to the challenger. It is highly unlikely that an
adversary will be able to win this game except against very simple PRF families. And there are
function families, for example taking $K = D = C = \{0, 1\}^n$ and $F_k(x) = k \oplus x$, for which the game
is unwinnable for even infinitely powerful adversaries.

Figure 11.2. Second attempt at a security game for a PRF

To give the adversary a fighting chance, we would like to give her the ability to ask multiple
queries on inputs $x$ of her choosing. This will balance up the abilities of the adversary and the
challenger, and hence make a more meaningful security definition for future use. So a better
definition is given by the game in Figure 11.3; note that here the adversary can now ask various
queries $x$ of her choosing. We denote this in our diagrams by saying the adversary has access to
an oracle $O_{F_k}$ which she can call multiple times, and we place the “code” for such oracles on the
right of the adversary. These “oracles” are subroutines that we give to the adversary; they provide
access to functions on data which can be adaptively chosen by the adversary. However, in our PRF
game we have to be a little careful to avoid giving different random answers in the case of $b = 0$
when the adversary asks the same $x$ twice. Thus to avoid two different answers being given for the
same query $x$, we record the answers provided via means of the state $\mathcal{L}$.

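Challenger: $b \leftarrow \{0, 1\}$, $k \leftarrow K$, $\mathcal{L} \leftarrow \{\}$.
Adversary $A$: is given $\{F_k\}_K$, may call the oracle $O_{F_k}$ on inputs $x \in D$, and outputs $b'$.
Oracle $O_{F_k}$, on input $x \in D$:
    If $\exists (x, y') \in \mathcal{L}$ then $y \leftarrow y'$
    else if $b = 0$ then $y \leftarrow C$
    else $y \leftarrow F_k(x)$.
    $\mathcal{L} \leftarrow \mathcal{L} \cup \{(x, y)\}$.
    Return $y$.
Win if $b' = b$.

Figure 11.3. The final security game for a PRF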
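
The oracle in this game, with its table of recorded answers, can be sketched in Python as follows. This is an illustrative toy and not a definitive rendering of the figure: the class name `PRFGame`, the `sample` hook used to make the random case reproducible, the 64-bit block size and the adversary `xor_adversary` are all assumptions introduced here. The function family is the XOR family mentioned above, which is unbeatable with one query but, as the sketch shows, easily distinguished with two.

```python
import secrets

N_BITS = 64  # hypothetical block size for the toy family

def xor_family(k, x):
    """The toy family F_k(x) = k XOR x."""
    return k ^ x

class PRFGame:
    def __init__(self, family, b=None, sample=None):
        self.family = family
        self.b = secrets.randbelow(2) if b is None else b   # hidden bit
        self.k = secrets.randbits(N_BITS)                    # secret key
        self.sample = sample or (lambda: secrets.randbits(N_BITS))
        self.answers = {}   # recorded (x, y) pairs, so repeats are consistent

    def oracle(self, x):
        if x in self.answers:
            return self.answers[x]
        y = self.sample() if self.b == 0 else self.family(self.k, x)
        self.answers[x] = y
        return y

    def wins(self, guess):
        return guess == self.b

def xor_adversary(oracle):
    """Two queries suffice: for the real family the outputs differ by x1 XOR x2."""
    x1, x2 = 0, 1
    return 1 if (oracle(x1) ^ oracle(x2)) == (x1 ^ x2) else 0

game = PRFGame(xor_family)
print("adversary wins:", game.wins(xor_adversary(game.oracle)))
```

Against a truly random function the check succeeds only with probability $2^{-64}$ here, so this adversary wins almost always, which is why the XOR family is not a secure PRF once multiple queries are allowed.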


We define the advantage, much like we did for the Decision Diffie-Hellman problem, as follows,

$$\mathrm{Adv}^{\mathrm{PRF}}_{\{F_k\}_K}(A) = 2 \cdot \left| \Pr\left[ A^{O_{F_k}} \text{ wins} \right] - \frac{1}{2} \right|,$$

where we use the superscript $O_{F_k}$ to indicate that the adversary has access to an oracle which
computes the function in the forwards direction. Each call to the $O_{F_k}$ oracle is counted as one time
step, and so the number of oracle calls is bounded above by the running time of the adversary. If
we want to make explicit the number of oracle calls we denote it by $q_{O_{F_k}}$ and write the advantage
as $\mathrm{Adv}^{\mathrm{PRF}}_{\{F_k\}_K}(A; q_{O_{F_k}})$.

Recall that as this is a decision game we need to subtract $1/2$ from the probability of the
adversary winning, since she could just guess the bit $b$ and still win with probability $1/2$. Thus by
subtracting $1/2$ we ensure the advantage measures the extra power the adversary has over random
guessing. In addition, we have the following analogue of Lemma 2.3,

Lemma 11.1. Let $A$ be an adversary against the PRF security of the function family $\{F_k\}_K$, then
if $b'$ is the bit chosen by $A$ and $b$ is the bit chosen by the challenger in the game, we have

$$\mathrm{Adv}^{\mathrm{PRF}}_{\{F_k\}_K}(A) = \left| \Pr[b' = 1 \mid b = 1] - \Pr[b' = 1 \mid b = 0] \right|.$$
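
The step behind this lemma is short enough to write out. The following derivation is a sketch that assumes only that the challenger picks the bit $b$ uniformly, so that $\Pr[b = 0] = \Pr[b = 1] = 1/2$:

```latex
\begin{align*}
\Pr[A \text{ wins}]
  &= \Pr[b' = 1 \mid b = 1] \cdot \Pr[b = 1] + \Pr[b' = 0 \mid b = 0] \cdot \Pr[b = 0] \\
  &= \tfrac{1}{2} \Pr[b' = 1 \mid b = 1] + \tfrac{1}{2} \left( 1 - \Pr[b' = 1 \mid b = 0] \right), \\
2 \cdot \left( \Pr[A \text{ wins}] - \tfrac{1}{2} \right)
  &= \Pr[b' = 1 \mid b = 1] - \Pr[b' = 1 \mid b = 0].
\end{align*}
```

So the advantage measures the size of the gap between how often the adversary outputs one in the real case and how often she outputs one in the random case.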

An important concept related to a PRF is that of a pseudo-random permutation (PRP). In this
concept the domain and codomain are the same set $D$, and the function is one-to-one. Since an
adversary can tell a PRF from a PRP if she discovers that the function is not one-to-one, we define
the PRP security game as in Figure 11.4.

Challenger state: $\mathcal{L} \leftarrow \{\}$.
Oracle $O_{F_k}$, on input $x \in D$:
    If $\exists (x, y') \in \mathcal{L}$ then $y \leftarrow y'$
    else if $b = 1$ then $y \leftarrow F_k(x)$
    else (repeat $y \leftarrow D$ until $\nexists (x', y) \in \mathcal{L}$).
    $\mathcal{L} \leftarrow \mathcal{L} \cup \{(x, y)\}$.
    Return $y$.

Figure 11.4. The security game for a PRP


Just as above we define the advantage in the obvious way, using the following notation:

$$\mathrm{Adv}^{\mathrm{PRP}}_{\{F_k\}_K}(A) = 2 \cdot \left| \Pr\left[ A^{O_{F_k}} \text{ wins} \right] - \frac{1}{2} \right|,$$

where we use the superscript $O_{F_k}$ to indicate that the adversary has access to an oracle which
computes the permutation in the forwards direction. We can also define a game in which the
adversary gets access to an additional oracle which enables her to invert the permutation on elements
of her choice as well. We denote the advantage then by the notation

$$\mathrm{Adv}^{\mathrm{PRP}}_{\{F_k\}_K}(A) = 2 \cdot \left| \Pr\left[ A^{O_{F_k}, O_{F_k^{-1}}} \text{ wins} \right] - \frac{1}{2} \right|.$$

As  an  exercise  you  should  draw  a  picture,  as  above,  to  describe  the  game  that  the  adversary  plays, 
and  define  the  consistency  checks  that  the  challenger  needs  to  perform  in  the  case  of  b  =  0. 

Notice  how  we  defined  PRF  security  with  the  adversary  unable  to  tell  the  difference  between 
accessing  the  PRF  and  accessing  a  random  function,  and  we  defined  PRP  security  with  the  adver¬ 
sary  unable  to  tell  the  difference  between  accessing  the  PRP  and  accessing  a  random  permutation. 
In  both  games  the  adversary  has  essentially  the  same  interface  to  both  the  game  and  its  oracle. 
An  interesting  question  which  immediately  comes  to  mind  is  whether  an  adversary  can  tell  the 
difference  between  a  PRP  family  and  a  PRF  family,  when  the  PRP  and  PRF  family  have  the  same 
domain  and  codomain.  We  can  think  of  taking  the  same  adversary  A  and  placing  her  in  the  PRF 
game  or  the  PRP  game;  we  can  then  ask,  can  she  tell  the  difference? 

To answer this question we need to define what we mean by “tell the difference”. For this we
mean that the adversary behaves differently; but the only thing the adversary does is output a bit
$b'$. Thus we can think of telling the difference as meaning that there will be some difference in the
advantage of the adversary $A$ in one game compared to the other. Thus we want to bound

$$\left| \mathrm{Adv}^{\mathrm{PRF}}_{\{F_k\}_K}(A; q) - \mathrm{Adv}^{\mathrm{PRP}}_{\{F_k\}_K}(A; q) \right|$$

for some PRP family $\{F_k\}_K$ with domain/codomain $D$.

Lemma 11.2 (PRP-PRF Switching Lemma). Let $A$ be an adversary and $\{F_k\}_K$ be a family of
pseudo-random permutations with domain and codomain equal to $D$, then

$$\left| \mathrm{Adv}^{\mathrm{PRF}}_{\{F_k\}_K}(A; q) - \mathrm{Adv}^{\mathrm{PRP}}_{\{F_k\}_K}(A; q) \right| \le \frac{q^2}{|D|}.$$
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Proof.  Suppose  we  run  A  in  the  PRF  game,  and  we  let  E  denote  the  event  that  the  oracle  called 
by  A  returns  the  same  value  in  the  codomain  for  two  distinct  input  values.  We  have 

Pr [A  wins  the  PRF  game]  =  Pr [A  wins  the  PRF  game  |  E]  •  Pr[E] 

+  Pr  [A  wins  the  PRF  game  |  -i E ]  •  Pt[-^E] 

<  Pr[E]  +  Pr  [A  wins  the  PRF  game  |  -i E ] 

=  Pr[E]  +  Pr  [A  wins  the  PRP  game  |  —iE\. 

Now we note that the probability of $E$ occurring is, from the birthday bound, at most $q^2 / (2 \cdot |D|)$.
So we have, where we assume the probability of $A$ winning is always at least $1/2$ (which we can do
without loss of generality),


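$$\begin{aligned}
\left| \mathrm{Adv}^{\mathrm{PRF}}_{\{F_k\}_K}(A; q) - \mathrm{Adv}^{\mathrm{PRP}}_{\{F_k\}_K}(A; q) \right|
&= \left| \, 2 \cdot \left| \Pr[A \text{ wins the PRF game}] - \tfrac{1}{2} \right| - 2 \cdot \left| \Pr[A \text{ wins the PRP game}] - \tfrac{1}{2} \right| \, \right| \\
&= \left| \, 2 \cdot \Pr[A \text{ wins the PRF game}] - 1 - 2 \cdot \Pr[A \text{ wins the PRP game}] + 1 \, \right| \\
&\le 2 \cdot \Pr[E] && \text{by above} \\
&\le \frac{q^2}{|D|} && \text{by the birthday bound.}
\end{aligned}$$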


□ 
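The birthday bound used in the proof can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names are our own. It computes the exact probability that $q$ independent uniform outputs in a set of size $|D|$ contain a repeat (the event $E$ for $q$ distinct queries to a random function) and compares it with the bound $q^2/(2 \cdot |D|)$.

```python
from fractions import Fraction

def collision_probability(q: int, domain_size: int) -> Fraction:
    """Exact Pr[E]: q independent uniform values from a set of the given size repeat."""
    no_collision = Fraction(1)
    for i in range(q):
        # The (i+1)-th value must avoid the i distinct values already seen.
        no_collision *= Fraction(domain_size - i, domain_size)
    return 1 - no_collision

def birthday_bound(q: int, domain_size: int) -> Fraction:
    """The upper bound q^2 / (2 * |D|) quoted in the proof."""
    return Fraction(q * q, 2 * domain_size)

def switching_bound(q: int, domain_size: int) -> Fraction:
    """The bound q^2 / |D| on the difference of the two advantages."""
    return 2 * birthday_bound(q, domain_size)

if __name__ == "__main__":
    for q in (10, 100, 1000):
        d = 2 ** 32
        print(q, float(collision_probability(q, d)), float(birthday_bound(q, d)))
```

For a block cipher with a 128-bit block the bound only becomes significant once $q$ approaches $2^{64}$ queries, which is why a good PRP can usually be treated as a PRF.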


11.3.  One-Way  Functions  and  Trapdoor  One-Way  Functions 

The discrete logarithm problem is an example of a one-way function: we give the adversary a public
function (in this case the function $f(x) = g^x$) and ask her to invert the function on an element
of the challenger's choosing. In terms of pictures this is given by Figure 11.5, for a function $F$
with domain $D$ and codomain $C$. Notice the similarity between this diagram and Figure 3.1. Note
that we do not need to give the adversary an oracle to query values of $F$ of her choosing since the
function $F$ is given to the adversary. We define the advantage $\mathrm{Adv}^{\mathrm{OWF}}_F(A)$ in the usual way as

$$\mathrm{Adv}^{\mathrm{OWF}}_F(A) = \Pr[A \text{ wins the OWF game}].$$

Notice how we can recast the discrete logarithm problem as an example of a one-way function.
Thus the DLP security game is the same as the OWF game for the specific function $F : x \mapsto g^x$ in
the group generated by $g$.
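To make the OWF game concrete, here is a minimal Python sketch for the function $F : x \mapsto g^x$. The parameters (the prime 1019 and base 2) are toy values chosen by us purely for illustration, and the names `owf_game` and `brute_force_adversary` are hypothetical. At this size exhaustive search inverts $F$ trivially, which is exactly why real groups are chosen to be enormous.

```python
import secrets

P = 1019   # toy prime modulus (assumption; far too small to be secure)
G = 2      # toy base; we work in the subgroup of (Z/PZ)^* it generates

def F(x: int) -> int:
    """The public function F(x) = g^x mod p."""
    return pow(G, x, P)

def owf_game(adversary) -> bool:
    """Run the one-way function game once; return True if the adversary wins."""
    x = secrets.randbelow(P - 1)   # x <- D, here D = {0, ..., P-2}
    h = F(x)                       # h <- F(x)
    x_prime = adversary(F, h)      # the adversary is given F and h
    return F(x_prime) == h         # win condition: F(x') = h, not x' = x

def brute_force_adversary(f, h):
    """Exhaustive search over the toy domain."""
    for candidate in range(P - 1):
        if f(candidate) == h:
            return candidate
    raise ValueError("no preimage found")

if __name__ == "__main__":
    print(owf_game(brute_force_adversary))
```

Note that the winning condition only asks for some preimage of $h$, which matters when $F$ is not injective on $D$.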


    $x \leftarrow D$
    $h \leftarrow F(x)$
    $F, h \rightarrow A$
    $x' \leftarrow A$
    Win if $F(x') = h$


Figure 11.5. Security game for a one-way function

Now look again at another of our number-theoretic functions: the RSA problem as in Figure
2.2. Recall that this was the problem, given $e$ and $N$ such that $\gcd(e, (p-1)(q-1)) = 1$ and a
value $y$, to find $x$ such that

$$x^e = y \pmod{N}.$$

This is similar to the problem of inverting the one-way function given by $F_{N,e}(x) = x^e \pmod{N}$;
however it is more than that. The function $F_{N,e}(x)$ in the RSA problem has an extra property:
there exists a value $d$ which allows one to efficiently invert the function. The value $d$ is called the
trapdoor, and such functions are called trapdoor one-way functions. In fact, even more is true: since
the RSA function $F_{N,e}(x) = x^e \pmod{N}$ acts as a permutation on the group $\mathbb{Z}/N\mathbb{Z}$, we actually
have a trapdoor one-way permutation.

We can think of the RSA game as the one-way function game above, but with $F$ being the RSA
function for some specific modulus $N$ and an exponent $e$. It is believed that, for values of $N$ which
are a product of two randomly generated primes of roughly the same size, the advantage
$\mathrm{Adv}^{\mathrm{OWF}}_{F_{N,e}}(A)$ is very small indeed.
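The trapdoor property can be seen directly with toy numbers. The sketch below is our own illustration with deliberately tiny primes (61 and 53), not parameters from the text. It computes the trapdoor $d$ from the factorization and checks that $F_{N,e}$ is a permutation of $\{0, \ldots, N-1\}$ that $d$ inverts.

```python
from math import gcd

# Toy parameters (assumption): real moduli are thousands of bits long.
p, q = 61, 53
N = p * q                  # 3233
phi = (p - 1) * (q - 1)    # 3120
e = 17
assert gcd(e, phi) == 1

# The trapdoor: d = e^{-1} mod (p-1)(q-1). Requires Python 3.8+ for pow(e, -1, m).
d = pow(e, -1, phi)        # 2753 for these parameters

def rsa_forward(x: int) -> int:
    """The public function F_{N,e}(x) = x^e mod N."""
    return pow(x, e, N)

def rsa_invert(y: int) -> int:
    """Inversion using the trapdoor d."""
    return pow(y, d, N)

if __name__ == "__main__":
    y = rsa_forward(65)
    print(y, rsa_invert(y))
```

Computing `d` required `phi`, i.e. the factors of $N$; an adversary who sees only $N$ and $e$ has no such shortcut, which is the content of the RSA assumption.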


11.4.  Public  Key  Cryptography 

Recall that in symmetric key cryptography each communicating party needs to have a copy of
the same secret key. This leads to a very difficult key management problem, since in a set of $n$
people we would require $n \cdot (n-1)/2$ different symmetric keys to secure all possible communication
patterns. In public key cryptography we replace the use of identical keys with two keys, one public
and one private, related in a mathematical way.
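A quick calculation shows how fast the symmetric key count grows. This small Python sketch, with function names of our own choosing, compares the $n \cdot (n-1)/2$ pairwise symmetric keys with the $n$ key pairs needed when each user publishes a single public key.

```python
def symmetric_keys_needed(n: int) -> int:
    """One shared key per unordered pair of users: n * (n - 1) / 2."""
    return n * (n - 1) // 2

def public_key_pairs_needed(n: int) -> int:
    """One (public, private) key pair per user."""
    return n

if __name__ == "__main__":
    for n in (10, 1000, 1_000_000):
        print(n, symmetric_keys_needed(n), public_key_pairs_needed(n))
```

For 1000 users this is 499500 symmetric keys against 1000 key pairs.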

The  public  key  can  be  published  in  a  directory  along  with  the  user’s  name.  Anyone  who  then 
wishes  to  send  a  message  to  the  holder  of  the  associated  private  key  will  take  the  public  key,  encrypt 
a  message  under  it  and  send  it  to  that  key  holder.  The  idea  is  that  only  the  holder  of  the  private 
key  will  be  able  to  decrypt  the  message.  More  clearly,  we  have  the  transforms 

Message  +  Alice’s  public  key  =  Ciphertext, 

Ciphertext  +  Alice’s  private  key  =  Message. 

Hence  anyone  with  Alice’s  public  key  can  send  Alice  a  secret  message.  But  only  Alice  can  decrypt 
the  message,  since  only  Alice  has  the  corresponding  private  key. 

Public  key  systems  work  because  the  two  keys  are  linked  in  a  mathematical  way,  such  that 
knowing  the  public  key  does  not  allow  you  to  compute  anything  about  the  private  key;  think  of 
how  knowing  N  and  e  tells  us  nothing  about  d  in  the  RSA  problem.  But  knowing  the  private 
key  allows  you  to  unlock  information  encrypted  with  the  public  key;  again  thinking  of  the  RSA 
problem,  the  value  d  allows  us  to  invert  the  RSA  function. 

The concept of being able to encrypt using a key which is not kept secret was so strange that it
was not until 1976 that anyone thought of it. The idea was first presented in the seminal paper of
Diffie and Hellman entitled New Directions in Cryptography. Although Diffie and Hellman invented
the concept of public key cryptography it was not until a year or so later that the first (and most
successful) system, namely RSA, was invented.

The previous paragraph describes the “official” history of public key cryptography. However, in
the late 1990s an unofficial history came to light. It turned out that in 1969, over five years before
Diffie and Hellman publicly invented public key cryptography, a cryptographer called James Ellis,
working for the British government's communication headquarters GCHQ, invented the concept of
public key cryptography (or non-secret encryption as he called it) as a means of solving the key
distribution problem. Ellis, just like Diffie and Hellman, did not come up with a system.

The problem of finding such a public key encryption system was given to a new GCHQ recruit
called Clifford Cocks in 1973. Within a day Cocks had invented what was essentially the RSA
algorithm, a full four years before Rivest, Shamir and Adleman invented RSA. In 1974 another
employee at GCHQ, Malcolm Williamson, invented the concept of Diffie-Hellman key exchange,
which we shall return to in Chapter 18. Hence, by 1974 the British security services had already
discovered the main techniques in public key cryptography.

There are a surprisingly small number of ideas behind public key encryption algorithms, which
may explain why, once Diffie and Hellman, and Ellis, had the concept of public key encryption, two
inventions of essentially the same cipher (i.e. RSA) came so quickly. There are so few ideas because
we  require  a  mathematical  operation  which  is  easy  to  do  one  way  (i.e.  encrypt),  but  which  is  hard 
to  reverse  (i.e.  decrypt),  without  some  special  secret  information,  namely  the  private  key.  Such 
a  mathematical  function  is  exactly  an  example  of  a  trapdoor  one-way  function  mentioned  above, 
since  it  is  effectively  a  one-way  function  unless  one  knows  the  key  to  the  trapdoor.  We  shall  return 
to  methods  to  construct  public  key  encryption  algorithms  in  a  later  chapter;  for  now  we  are  just 
interested  in  the  abstract  concept. 

11.5.  Security  of  Encryption 

We  are  now  in  a  position  to  define  the  security  of  both  symmetric  key  and  public  key  encryption 
algorithms.  There  are  three  things  we  need  to  define: 

•  the  goal  of  the  adversary, 

•  the  types  of  attack  allowed, 

•  the  computational  model. 

The  first  of  these  corresponds  to  the  winning  condition  of  an  analogue  of  the  games  considered 
above  (i.e.  what  does  breaking  a  cryptosystem  mean?),  the  second  corresponds  to  which  oracles  we 
allow  the  adversary  to  access  (i.e.  what  powers  do  we  give  the  adversary?),  whilst  the  third  is  a 
little  more  complex,  so  we  will  return  to  it  at  the  end  of  this  chapter. 

11.5.1. Basic Notions of Security: To fix notation we assume an encryption scheme is defined
for a message space $\mathbb{P}$, a ciphertext space $\mathbb{C}$ and a key space $\mathbb{K}$. For symmetric algorithms we
denote the shared key by $k \in \mathbb{K}$. For a public key algorithm we denote the pair of keys by $(pk, sk)$
where $sk \in \mathbb{K}$ is the secret key, and $pk$ is the associated public key. When generating keys for our
games we use the notation $k \leftarrow \mathrm{KeyGen}()$ for symmetric key systems and $(pk, sk) \leftarrow \mathrm{KeyGen}()$ for
public key systems.

We denote encryption for symmetric key algorithms by $c \leftarrow e_k(m)$, with decryption given by
$m \leftarrow d_k(c)$. For public key algorithms we define encryption and decryption by $c \leftarrow e_{pk}(m)$ and
$m \leftarrow d_{sk}(c)$. We assume that decryption always works for validly encrypted messages, i.e. we have
for symmetric key schemes that

$$\forall k \in \mathbb{K},\ \forall m \in \mathbb{P},\quad d_k(e_k(m)) = m,$$

with a similar condition holding for public key schemes.
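The notation above maps naturally onto three functions. The following Python sketch uses a toy XOR-based scheme of our own invention (fixed 16-byte messages, key as long as the message) simply to exercise the interface and the correctness condition $d_k(e_k(m)) = m$. It is not a scheme from the text and says nothing about security when a key is reused.

```python
import secrets

BLOCK_LEN = 16  # toy fixed message length in bytes (assumption)

def keygen() -> bytes:
    """k <- KeyGen(): a uniformly random key."""
    return secrets.token_bytes(BLOCK_LEN)

def enc(k: bytes, m: bytes) -> bytes:
    """c <- e_k(m): XOR the message with the key."""
    if len(m) != BLOCK_LEN:
        raise ValueError("toy scheme only handles 16-byte messages")
    return bytes(a ^ b for a, b in zip(k, m))

def dec(k: bytes, c: bytes) -> bytes:
    """m <- d_k(c): XOR is its own inverse."""
    return enc(k, c)

def check_correctness(trials: int = 100) -> bool:
    """Sample keys and messages and test d_k(e_k(m)) = m."""
    for _ in range(trials):
        k = keygen()
        m = secrets.token_bytes(BLOCK_LEN)
        if dec(k, enc(k, m)) != m:
            return False
    return True

if __name__ == "__main__":
    print(check_correctness())
```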


    $k \leftarrow \mathrm{KeyGen}()$
    $m^* \leftarrow \mathbb{P}$
    $c^* \leftarrow e_k(m^*)$, and $c^* \rightarrow A$
    $m' \leftarrow A$
    Win if $m' = m^*$

Figure 11.6. Security game for symmetric key OW-PASS

As a basic notion of security we take the idea that we do not want an adversary to learn the
message underlying a specific ciphertext. This gives us the notion of one-way security, which in
its basic form (for symmetric encryption) is given by the game in Figure 11.6, with the equivalent
public key security game being defined by Figure 11.7. We call $c^*$ the challenge ciphertext and
such an attack is called a passive attack. We denote the security game by OW-PASS, as a shorthand
for One Way-PASSive attack. We define the advantage against the encryption scheme $\Pi$ by
$\mathrm{Adv}^{\text{OW-PASS}}_{\Pi}(A) = \Pr[A \text{ wins}]$.
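The OW-PASS game can be written as a small test harness. The sketch below is ours and purely illustrative: `enc` is a one-byte XOR cipher, `broken_enc` is a deliberately useless "scheme" that ignores the key, and `estimate_advantage` approximates $\Pr[A \text{ wins}]$ by sampling.

```python
import secrets

MESSAGE_SPACE = range(256)   # toy message space: single bytes (assumption)

def keygen() -> int:
    return secrets.randbelow(256)

def enc(k: int, m: int) -> int:
    """One-byte XOR cipher; a fresh key is drawn for every game."""
    return m ^ k

def broken_enc(k: int, m: int) -> int:
    """A deliberately broken 'scheme' that ignores its key."""
    return m

def ow_pass_game(encrypt, adversary) -> bool:
    """One run of the OW-PASS game; no oracles are given to the adversary."""
    k = keygen()
    m_star = secrets.choice(MESSAGE_SPACE)
    c_star = encrypt(k, m_star)
    return adversary(c_star) == m_star

def estimate_advantage(encrypt, adversary, trials: int = 2000) -> float:
    """Monte Carlo estimate of Pr[A wins]."""
    return sum(ow_pass_game(encrypt, adversary) for _ in range(trials)) / trials

def identity_adversary(c_star: int) -> int:
    """Guess that the ciphertext is the message."""
    return c_star

if __name__ == "__main__":
    print(estimate_advantage(broken_enc, identity_adversary))
    print(estimate_advantage(enc, identity_adversary))
```

Against `broken_enc` the adversary always wins; against the XOR cipher with a fresh key it wins only when the key happens to be zero, i.e. with probability $1/256$.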

    $(pk, sk) \leftarrow \mathrm{KeyGen}()$, and $pk \rightarrow A$
    $m^* \leftarrow \mathbb{P}$
    $c^* \leftarrow e_{pk}(m^*)$, and $c^* \rightarrow A$
    $m' \leftarrow A$
    Win if $m' = m^*$

Figure 11.7. Security game for public key OW-PASS/OW-CPA

On its own this is a very weak form of security since no oracles are provided to the adversary.
In other words the adversary has very limited powers with which she can attack the challenge
ciphertext. In particular, in the symmetric key case the adversary is only allowed to see one
encrypted message. Thus it is usually the case that the minimum security game also gives the
adversary access to an encryption oracle. This attack is called a chosen plaintext attack (CPA),
since we imagine giving the adversary access to a (black) box which performs encryption but not
decryption on plaintexts of her choosing. See Figure 11.8 for the symmetric key case, defining
a notion called OW-CPA (for One Way-Chosen Plaintext Attack). Notice that in the public key
passive attack case the adversary already has this ability due to her having the public key; thus
the OW-PASS game and OW-CPA game are equivalent in the public key setting. As notation for
the advantage we use $\mathrm{Adv}^{\text{OW-CPA}}_{\Pi}(A) = \Pr[A \text{ wins}]$.
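The extra power of an encryption oracle is easy to demonstrate. In the hedged sketch below the "scheme" is a secret random substitution table on bytes (our own toy choice). A single ciphertext reveals nothing about the message, yet a CPA adversary wins every time by asking the oracle to encrypt each possible message. Python's `random` module is used only so that runs are reproducible; it is not a cryptographic generator.

```python
import random

def keygen(rng: random.Random) -> list:
    """The key is a random permutation of the 256 byte values."""
    table = list(range(256))
    rng.shuffle(table)
    return table

def enc(k: list, m: int) -> int:
    """Deterministic encryption: look the message up in the table."""
    return k[m]

def ow_cpa_game(adversary, rng: random.Random) -> bool:
    """One run of the symmetric key OW-CPA game."""
    k = keygen(rng)
    m_star = rng.randrange(256)
    c_star = enc(k, m_star)

    def enc_oracle(m: int) -> int:
        return enc(k, m)

    return adversary(c_star, enc_oracle) == m_star

def dictionary_adversary(c_star: int, enc_oracle) -> int:
    """Encrypt every message and look for the challenge ciphertext."""
    for m in range(256):
        if enc_oracle(m) == c_star:
            return m
    raise ValueError("challenge not found")

if __name__ == "__main__":
    print(ow_cpa_game(dictionary_adversary, random.Random(0)))
```

The attack works because the scheme is deterministic and the message space is small enough to enumerate.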


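    $k \leftarrow \mathrm{KeyGen}()$
    $m^* \leftarrow \mathbb{P}$
    $c^* \leftarrow e_k(m^*)$, and $c^* \rightarrow A$
    Oracle $O_{e_k}$: on input $m \in \mathbb{P}$ from $A$, return $c \leftarrow e_k(m)$
    $m' \leftarrow A$
    Win if $m' = m^*$

Figure 11.8. Security game for symmetric key OW-CPA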

A more complex notion of security involves also allowing the adversary the ability to decrypt
ciphertexts of her choosing via a decryption oracle. This is called a chosen ciphertext attack
(CCA). Clearly, to make the security game non-trivial, we have to stop the adversary requesting
the decryption of the challenge ciphertext. Pictorial definitions of the OW-CCA attacks in the
symmetric and public key settings are given in Figures 11.9 and 11.10. As notation for the advantage
we use $\mathrm{Adv}^{\text{OW-CCA}}_{\Pi}(A) = \Pr[A \text{ wins}]$.
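The restriction on the decryption oracle, and why it is not enough to rescue a malleable scheme, can be sketched as follows. The one-byte XOR cipher and the names `Abort`, `naive_adversary` and `malleability_adversary` are our own illustrative choices.

```python
import secrets

class Abort(Exception):
    """Raised when the adversary asks for the decryption of the challenge."""

def keygen() -> int:
    return secrets.randbelow(256)

def enc(k: int, m: int) -> int:
    return m ^ k

def dec(k: int, c: int) -> int:
    return c ^ k

def ow_cca_game(adversary) -> bool:
    """One run of the symmetric key OW-CCA game."""
    k = keygen()
    m_star = secrets.randbelow(256)
    c_star = enc(k, m_star)

    def enc_oracle(m: int) -> int:
        return enc(k, m)

    def dec_oracle(c: int) -> int:
        if c == c_star:          # "If c = c* then abort."
            raise Abort()
        return dec(k, c)

    try:
        return adversary(c_star, enc_oracle, dec_oracle) == m_star
    except Abort:
        return False

def naive_adversary(c_star, enc_oracle, dec_oracle):
    """Asks for the challenge directly, so the game aborts."""
    return dec_oracle(c_star)

def malleability_adversary(c_star, enc_oracle, dec_oracle):
    """Flips one bit of c*, decrypts the related ciphertext, flips it back."""
    return dec_oracle(c_star ^ 1) ^ 1

if __name__ == "__main__":
    print(ow_cca_game(naive_adversary), ow_cca_game(malleability_adversary))
```

The second adversary never queries $c^*$ itself, yet recovers $m^*$ from the decryption of a related ciphertext; chosen ciphertext security has to rule out exactly this kind of behaviour.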

    $k \leftarrow \mathrm{KeyGen}()$
    $m^* \leftarrow \mathbb{P}$
    $c^* \leftarrow e_k(m^*)$, and $c^* \rightarrow A$
    Oracle $O_{e_k}$: on input $m \in \mathbb{P}$ from $A$, return $c \leftarrow e_k(m)$
    Oracle $O_{d_k}$: on input $c \in \mathbb{C}$ from $A$, if $c = c^*$ then abort; otherwise return $m \leftarrow d_k(c)$
    $m' \leftarrow A$
    Win if $m' = m^*$

Figure 11.9. Security game for symmetric key OW-CCA

    $(pk, sk) \leftarrow \mathrm{KeyGen}()$, and $pk \rightarrow A$
    $m^* \leftarrow \mathbb{P}$
    $c^* \leftarrow e_{pk}(m^*)$, and $c^* \rightarrow A$
    Oracle $O_{d_{sk}}$: on input $c \in \mathbb{C}$ from $A$, if $c = c^*$ then abort; otherwise return $m \leftarrow d_{sk}(c)$
    $m' \leftarrow A$
    Win if $m' = m^*$

Figure 11.10. Security game for public key OW-CCA

11.5.2. Modern Notions of Security: The above simple notion of “breaking” an encryption
algorithm is not very good. In particular it does not allow us to model the situation where an
adversary can break a part of a message, but not all of it. Such situations could be important
in the real world, and so we need definitions which capture the requirement that the adversary
should not be able to obtain any information about the plaintext. There
are essentially three notions of security which we need to understand:

• Perfect security,

• Semantic security,

• IND security (short for INDistinguishability of encryptions), a.k.a. polynomial security.

We  shall  discuss  each  of  these  notions  in  turn.  They  are  far  stronger  than  the  simple  notions  of 
either  recovering  the  private  key  or  determining  the  plaintext  which  we  have  considered  previously. 

Perfect  Security:  Recall  that  a  scheme  is  said  to  have  perfect  security,  or  information-theoretic 
security,  if  an  adversary  with  infinite  computing  power  can  learn  nothing  about  the  plaintext  given 
the  ciphertext.  Shannon’s  Theorem,  Theorem  9.4,  essentially  states  that  this  is  achieved  if  and 
only  if  the  key  is  as  long  as  the  message,  and  the  same  key  is  never  used  twice. 

The  problem  is  that  such  systems  cannot  exist  in  the  public  key  model,  since  the  encryption 
key  is  assumed  to  be  used  for  many  messages  (so  it  is  not  of  one-time  use).  In  addition,  for  both 
modern  public  key  and  symmetric  systems  we  want  to  use  a  short  key  to  encrypt  large  amounts 
of  data  (e.g.  a  movie).  Hence,  we  will  never  normally  use  a  system  for  which  the  key  is  as  long  as 
the  message.  Thus,  for  any  notion  of  security  in  the  real  world  the  notion  of  perfect  security  is  too 
strong. 

Semantic  Security:  Semantic  security  is  like  perfect  security  but  we  only  allow  an  adversary 
with  polynomially  bounded  computing  power.  Formally,  for  all  probability  distributions  on  the 
message  space,  whatever  an  adversary  can  compute  (in  polynomial  time)  about  the  plaintext  given 
the  ciphertext,  she  should  also  be  able  to  compute  without  the  ciphertext.  In  other  words,  having 
the  ciphertext  does  not  help  in  finding  out  anything  about  the  message.  This  is  intuitively  the 
equivalent  of  perfect  security  for  adversaries  whose  run  time  is  bounded  by  a  polynomial  function 
of  the  underlying  security  parameter  (i.e.  the  key  size). 

The  following  is  a  (very)  simplified  definition  which  we  use  purely  for  illustrative  purposes: 
suppose  that  the  information  we  wish  to  compute  on  the  message  space  is  a  single  bit,  i.e.  there  is 
some function

$$g : M \longrightarrow \{0, 1\}.$$

We assume that over the whole message space we have

$$\Pr[g(m) = 1] = \Pr[g(m) = 0] = \frac{1}{2},$$

and that the plaintexts and ciphertexts are all of the same length (so in particular the length of
the ciphertext reveals nothing about the underlying plaintext).

We  model  the  adversary  as  an  algorithm  S  which  on  input  of  a  ciphertext  c,  encrypted  under 
the  symmetric  key  k,  will  attempt  to  produce  an  evaluation  of  the  function  g  on  the  plaintext  for 
which  c  is  the  associated  ciphertext.  The  output  of  S  will  therefore  be  a  single  bit  corresponding 
to  the  value  of  g. 

The  adversary  is  deemed  to  be  successful  if  the  probability  of  it  producing  a  correct  output 
is  greater  than  one  half.  Clearly  the  adversary  could  always  just  guess  the  bit  without  seeing  the 
ciphertext,  hence  we  are  saying  that  a  successful  adversary  is  one  which  can  do  better  after  seeing 
the  ciphertext.  We  therefore  define  the  advantage  of  the  adversary  S  as 


$$\mathrm{Adv}^{\mathrm{SEM}}_{\Pi}(S) = 2 \cdot \left| \Pr[S(c) = g(d_k(c))] - \frac{1}{2} \right|.$$

A scheme is then said to be semantically secure if $\mathrm{Adv}^{\mathrm{SEM}}_{\Pi}(S)$ is “small” for all polynomial-time
adversaries $S$.¹
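As an illustration of this simplified definition, the following Python sketch estimates the advantage $2 \cdot |\Pr[S(c) = g(d_k(c))] - 1/2|$ by sampling. The "leaky" scheme, which hides the top seven bits of a byte but copies the lowest bit into the ciphertext, is a toy of our own design, as are all the function names.

```python
import secrets

def g(m: int) -> int:
    """The one-bit function of the plaintext we try to learn: its lowest bit."""
    return m & 1

def keygen() -> int:
    return secrets.randbelow(128)          # a 7-bit key

def leaky_enc(k: int, m: int) -> int:
    """Masks the top seven bits of the byte m but leaves the lowest bit in clear."""
    return (((m >> 1) ^ k) << 1) | (m & 1)

def leaky_dec(k: int, c: int) -> int:
    return (((c >> 1) ^ k) << 1) | (c & 1)

def sem_advantage(S, trials: int = 4000) -> float:
    """Monte Carlo estimate of 2 * |Pr[S(c) = g(d_k(c))] - 1/2|."""
    correct = 0
    for _ in range(trials):
        k = keygen()
        c = leaky_enc(k, secrets.randbelow(256))
        correct += int(S(c) == g(leaky_dec(k, c)))
    return 2 * abs(correct / trials - 0.5)

def leak_reader(c: int) -> int:
    """Adversary that reads the leaked bit off the ciphertext."""
    return c & 1

def coin_flipper(c: int) -> int:
    """Adversary that ignores the ciphertext and guesses."""
    return secrets.randbelow(2)

if __name__ == "__main__":
    print(sem_advantage(leak_reader), sem_advantage(coin_flipper))
```

The first adversary has advantage 1, while the guessing adversary has an estimated advantage close to 0.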


IND Security: The trouble with the definition of semantic security is that it is hard to show that
a given encryption scheme has this property. Polynomial security, sometimes called indistinguishability
of encryptions, or IND security for short, is a much easier property to confirm for a given
system. Luckily, we will show that if a system has IND security then it also has semantic security.
Hence, to show that a system is semantically secure all we need do is show that it is IND secure.

A system is said to have indistinguishable encryptions if no adversary can win the following
game with probability greater than one half. The adversary will run in two stages:

• Find: In the “find” stage the adversary produces two plaintext messages $m_0$ and $m_1$, of
equal length.

• Guess: The adversary is now given the encryption $c^*$ of one of the plaintexts $m_b$ for
some secret hidden bit $b$. The goal of the adversary is to now guess the value of $b$ with
probability greater than one half.

Just as for OW security we can define analogous notions of IND-PASS, IND-CPA and IND-CCA for
both symmetric key and public key algorithms. We summarize these in Figures 11.11 and 11.12,
which give the CCA variants for symmetric and public key encryption respectively. The simpler
notions, with fewer oracles, can be derived easily by the reader.


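    $k \leftarrow \mathrm{KeyGen}()$
    $b \leftarrow \{0, 1\}$
    Oracle $O_{LR}$: on input $m_0, m_1 \in \mathbb{P}$ from $A$, return $c^* \leftarrow e_k(m_b)$
    Oracle $O_{e_k}$: on input $m \in \mathbb{P}$ from $A$, return $c \leftarrow e_k(m)$
    Oracle $O_{d_k}$: on input $c \in \mathbb{C}$ from $A$, if $c = c^*$ then abort; otherwise return $m \leftarrow d_k(c)$
    $b' \leftarrow A$
    Win if $b' = b$

Figure 11.11. Security game for symmetric key IND-CCA

¹ Note that this is a very simplified definition of semantic security.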


Note that we denote the obtaining of the encryption of $m_b$ via an oracle call, called the LR-oracle
(for left-right oracle), which encrypts either the left or right input value depending on $b$. In
the standard IND games the LR-oracle may only be called once by the adversary. We assume in
both cases that the adversary may call the encryption and decryption oracles in both the find and
guess phases, i.e. both before and after the call to the $O_{LR}$ oracle. If on calling $O_{LR}$ the challenger
outputs a value $c^*$ which the adversary has already passed to the decryption oracle, then it should
also abort. We will implicitly assume this happens in all our diagrams and descriptions.


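    $(pk, sk) \leftarrow \mathrm{KeyGen}()$, and $pk \rightarrow A$
    $b \leftarrow \{0, 1\}$
    Oracle $O_{LR}$: on input $m_0, m_1 \in \mathbb{P}$ from $A$, return $c^* \leftarrow e_{pk}(m_b)$
    Oracle $O_{d_{sk}}$: on input $c \in \mathbb{C}$ from $A$, if $c = c^*$ then abort; otherwise return $m \leftarrow d_{sk}(c)$
    $b' \leftarrow A$
    Win if $b' = b$

Figure 11.12. Security game for public key IND-CCA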

Since the adversary $A$ could always simply guess the hidden bit $b$, we define the advantage of
an adversary $A$ just as we did in the PRF games via

$$\mathrm{Adv}^{\text{IND-PASS}}_{\Pi}(A) = 2 \cdot \left| \Pr[A \text{ wins}] - \frac{1}{2} \right|.$$

A scheme is then said to be IND-PASS secure if $\mathrm{Adv}^{\text{IND-PASS}}_{\Pi}(A)$ is “small” for all polynomial-time
adversaries $A$. Similarly, we can define the advantages $\mathrm{Adv}^{\text{IND-CPA}}_{\Pi}(A)$ and $\mathrm{Adv}^{\text{IND-CCA}}_{\Pi}(A)$, and
IND-CPA and IND-CCA security.

An important consequence of the above definitions is that any encryption function which satisfies
IND-CPA security must be probabilistic in nature. If it were not probabilistic then we would
have the following attack: In the guess stage the adversary can pass $m_1$ to her encryption oracle
(or encrypt the value herself in the case of public key algorithms) so as to obtain $c_1$, the encryption
of $m_1$. Then the adversary tests whether $c_1$ is equal to $c^*$. Since encryption is deterministic this
will determine whether the bit is zero or one.
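The attack on deterministic encryption just described takes only a few lines. In this hedged sketch the IND-CPA game, the toy substitution-table scheme and the adversary are all our own constructions; Python's `random` module is used only to make runs reproducible and is not a cryptographic generator.

```python
import random

def keygen(rng: random.Random) -> list:
    """Toy deterministic scheme: the key is a secret permutation of the bytes."""
    table = list(range(256))
    rng.shuffle(table)
    return table

def enc(k: list, m: int) -> int:
    return k[m]

def ind_cpa_game(adversary, rng: random.Random) -> bool:
    """One run of the symmetric key IND-CPA game."""
    k = keygen(rng)
    b = rng.randrange(2)
    lr_calls = []

    def lr_oracle(m0: int, m1: int) -> int:
        if lr_calls:
            raise RuntimeError("the LR oracle may only be called once")
        lr_calls.append((m0, m1))
        return enc(k, (m0, m1)[b])     # c* <- e_k(m_b)

    def enc_oracle(m: int) -> int:
        return enc(k, m)

    return adversary(lr_oracle, enc_oracle) == b

def determinism_adversary(lr_oracle, enc_oracle) -> int:
    m0, m1 = 0, 1                       # find stage: any two distinct messages
    c_star = lr_oracle(m0, m1)
    c1 = enc_oracle(m1)                 # guess stage: re-encrypt m_1
    return 1 if c1 == c_star else 0

if __name__ == "__main__":
    wins = sum(ind_cpa_game(determinism_adversary, random.Random(s)) for s in range(100))
    print(wins, "wins out of 100")
```

The adversary wins every run, so its advantage $2 \cdot |\Pr[A \text{ wins}] - 1/2|$ equals 1.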


The  following  definition  is  the  accepted  definition  of  what  it  means  for  an  encryption  scheme  to  be 
secure. 

Definition 11.3. A symmetric key (resp. public key) encryption algorithm is said to be secure if
it is semantically secure against a chosen ciphertext attack, i.e. for all poly-time adversaries $A$ the
value $\mathrm{Adv}^{\text{SEM-CCA}}_{\Pi}(A)$ is “small”.

However, usually it is easier to show security under the following definition.

Definition 11.4. A symmetric key (resp. public key) encryption algorithm is said to be secure if
it is IND-CCA secure, i.e. for all poly-time adversaries $A$ the value $\mathrm{Adv}^{\text{IND-CCA}}_{\Pi}(A)$ is “small”.

These two notions are however related. For example, we shall now show the following important
result.

Theorem 11.5. A system which is IND-PASS secure must necessarily be semantically secure
against passive adversaries.

Proof. We proceed by contradiction. Assume that there is an encryption algorithm which is not
semantically secure, i.e. that there is an algorithm $S$ with

$$\mathrm{Adv}^{\mathrm{SEM}}_{\Pi}(S) > \epsilon$$

for some “largish” value of $\epsilon$. But we also assume that the encryption algorithm is IND-PASS secure.
We shall then derive a contradiction. To do this we construct an adversary $A$ against the IND-PASS
security of the scheme, which uses the adversary $S$ against semantic security as an oracle.

The find stage of adversary $A$ outputs two messages $m_0$ and $m_1$ such that

$$g(m_0) \ne g(m_1).$$

Such messages will be easy to find given our earlier simplified formal definition of semantic security,
since the output of $g$ is equiprobable over the whole message space.

The adversary $A$ is then given an encryption $c_b$ of one of these and is asked to determine $b$. In
the guess stage the adversary passes the ciphertext $c_b$ to the oracle $S$. The oracle $S$ returns with
its best guess as to the value of $g(m_b)$. The adversary $A$ can now compare this value with $g(m_0)$
and $g(m_1)$ and hence output a guess as to the value of $b$.

Clearly if $S$ is successful in breaking the semantic security of the scheme, then $A$ will be
successful in breaking the polynomial security. So

$$\mathrm{Adv}^{\text{IND-PASS}}_{\Pi}(A) = \mathrm{Adv}^{\mathrm{SEM}}_{\Pi}(S) > \epsilon.$$

But such an $A$ is assumed not to exist, since the scheme is IND-PASS secure. This is our contradiction,
and hence, IND-PASS security must imply semantic security for passive adversaries. □
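The reduction in this proof can be expressed as code: the IND-PASS adversary $A$ is assembled from the semantic security adversary $S$. The sketch below is ours; it reuses a toy scheme that leaks the lowest bit of a byte so that a successful $S$ actually exists, and all names are hypothetical.

```python
import secrets

def g(m: int) -> int:
    return m & 1

def keygen() -> int:
    return secrets.randbelow(128)

def leaky_enc(k: int, m: int) -> int:
    """Toy scheme: masks the top seven bits, leaks the lowest bit."""
    return (((m >> 1) ^ k) << 1) | (m & 1)

def S(c: int) -> int:
    """An adversary against semantic security: predicts g(m) from c."""
    return c & 1

def make_ind_adversary(sem_adversary, bit_function, message_space):
    """Build the two stages of A from S, following the proof."""
    def find():
        # Choose m_0 and m_1 with g(m_0) != g(m_1).
        m0 = next(m for m in message_space if bit_function(m) == 0)
        m1 = next(m for m in message_space if bit_function(m) == 1)
        return m0, m1

    def guess(m0, m1, c_b):
        # Ask S for its guess at g(m_b) and compare it with g(m_0).
        return 0 if sem_adversary(c_b) == bit_function(m0) else 1

    return find, guess

def ind_pass_game(find, guess) -> bool:
    """One run of the IND-PASS game (no encryption or decryption oracles)."""
    k = keygen()
    b = secrets.randbelow(2)
    m0, m1 = find()
    c_b = leaky_enc(k, (m0, m1)[b])
    return guess(m0, m1, c_b) == b

if __name__ == "__main__":
    find, guess = make_ind_adversary(S, g, range(256))
    print(all(ind_pass_game(find, guess) for _ in range(100)))
```

Whenever $S$ predicts $g(m_b)$ correctly, $A$ outputs the correct bit, which mirrors the equality of advantages claimed in the proof.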


Note that with a more complicated definition of semantic security it can be shown that, for passive
adversaries, the notions of semantic and polynomial security are equivalent.


Let $\Pi$ be a symmetric encryption scheme, then we have the following implications

$$\Pi \text{ is IND-CCA} \implies \Pi \text{ is IND-CPA} \implies \Pi \text{ is IND-PASS},$$

since with each implication moving right we restrict the type of oracles called by the adversary. We
also have that

$$\Pi \text{ is IND-XXX} \implies \Pi \text{ is OW-XXX},$$

for XXX equal to PASS, CPA or CCA. For this implication suppose the opposite was true, i.e.
that the scheme is IND-XXX secure but not OW-XXX secure. In such a situation we know there is
an adversary $A$ which breaks the OW-XXX security of the scheme. It is easy to see that such an
adversary can be used to create an adversary $B$ which breaks the IND-XXX security. Let us see how
this is done with a picture for the symmetric key case: In Figure 11.13 the adversary $B$ contains
$A$, i.e. it uses $A$ as a subroutine². Clearly $B$ wins the IND-XXX game, and so the implication holds.

The  above  “proof”  is  typical  of  proofs  in  cryptography.  Notice  the  basic  idea:  We  take  an 
adversary  A  against  one  property,  and  then  place  it  inside  another  adversary  B  against  another 
property.  We  construct  the  adversary  B  via  “wiring”  up  adversary  A’s  oracle  queries  to  B's  oracles. 
Note  that  B  has  to  do  this  since  A  is  expecting  answers  to  its  queries,  to  which  B  has  to  respond. 
Then  we  use  the  answer  from  A  to  produce  an  answer  for  B.  Our  assumption  is  that  A  exists. 
However,  we  have  just  created  B  from  A.  Hence,  if  we  also  believe  that  B  does  not  exist  then  A 
cannot  exist  either.  In  terms  of  “code”  for  the  adversary  B  we  have 

• $m_0, m_1 \leftarrow \mathbb{P}$.

• Call $O_{LR}(m_0, m_1)$ to obtain $c^*$.

• Call adversary $A$ with target ciphertext $c^*$.

• If $A$ makes an encryption oracle query, pass the query to $B$'s equivalent oracle and respond
to $A$ with the returned value.

• If $A$ makes a decryption oracle query, pass the query to $B$'s equivalent oracle and respond
to $A$ with the returned value.

• Eventually $A$ will return a value $m'$.

• Set $b' = 0$ if $m' = m_0$ and $b' = 1$ otherwise.

• Return $b'$.

Figure 11.13. Constructing an IND-CCA adversary $B$ from a OW-CCA adversary $A$

² We give the diagram for the CCA case. The other cases are left to the reader.

You should convince yourself that the above “code” corresponds to the intuitive description in
Figure 11.13, as we will use both code-based and diagrammatically based descriptions as the book
progresses. In addition you should also convince yourself that the above “code” creates an algorithm
which breaks IND-CCA security given an algorithm which breaks OW-CCA security. In particular
we have

$$\mathrm{Adv}^{\text{OW-CCA}}_{\Pi}(A) \le \mathrm{Adv}^{\text{IND-CCA}}_{\Pi}(B).$$

Thus if $\Pi$ is IND-CCA secure, i.e. the advantage of $B$ is very small, then the advantage of $A$ must
be very small as well. This last statement holds for all adversaries $A$, as we made no assumption
on $A$ other than that it is an OW-CCA adversary. Thus $\Pi$ must also be OW-CCA secure. Hence, as claimed

$$\Pi \text{ is IND-CCA} \implies \Pi \text{ is OW-CCA}.$$

The argument can be extended trivially for CPA and PASS adversaries, as well as the public key
case.
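The “code” for $B$ translates almost line for line into Python. The sketch below is an illustration under our own assumptions: the scheme is a malleable one-byte XOR cipher, chosen only so that a successful OW-CCA adversary $A$ exists, and the game and function names are hypothetical.

```python
import secrets

def keygen() -> int:
    return secrets.randbelow(256)

def enc(k: int, m: int) -> int:
    return m ^ k

def dec(k: int, c: int) -> int:
    return c ^ k

def ow_cca_adversary_A(c_star, enc_oracle, dec_oracle) -> int:
    """An OW-CCA adversary for the toy scheme: decrypt a related ciphertext."""
    return dec_oracle(c_star ^ 1) ^ 1

def make_B(A):
    """Wrap the OW-CCA adversary A inside an IND-CCA adversary B."""
    def B(lr_oracle, enc_oracle, dec_oracle) -> int:
        m0, m1 = secrets.randbelow(256), secrets.randbelow(256)   # m0, m1 <- P
        c_star = lr_oracle(m0, m1)                                # call O_LR
        # A's oracle queries are wired straight through to B's own oracles.
        m_prime = A(c_star, enc_oracle, dec_oracle)
        return 0 if m_prime == m0 else 1
    return B

def ind_cca_game(B) -> bool:
    """One run of the symmetric key IND-CCA game."""
    k = keygen()
    b = secrets.randbelow(2)
    challenge = []

    def lr_oracle(m0: int, m1: int) -> int:
        if challenge:
            raise RuntimeError("the LR oracle may only be called once")
        challenge.append(enc(k, (m0, m1)[b]))
        return challenge[0]

    def dec_oracle(c: int) -> int:
        if challenge and c == challenge[0]:
            raise RuntimeError("abort: challenge ciphertext queried")
        return dec(k, c)

    return B(lr_oracle, lambda m: enc(k, m), dec_oracle) == b

if __name__ == "__main__":
    B = make_B(ow_cca_adversary_A)
    print(sum(ind_cca_game(B) for _ in range(400)), "wins out of 400")
```

$B$ wins whenever $A$ does, except in the rare runs where $m_0 = m_1$ and $b = 1$; this is one reason the relation between the advantages is stated as an inequality.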


11.6.  Other  Notions  of  Security 

The  above  are  the  standard  notions  of  security  for  encryption  schemes.  There  are  many  others, 
some  of  which  we  sketch  here. 

11.6.1. Many Time Security: In the IND game, if we allow the adversary to call the $O_{LR}$ oracle
many times then we denote the corresponding advantages by $\mathrm{Adv}^{\text{m-IND-PASS}}_{\Pi}(A)$ etc. Note that we have to modify the
decryption oracle in the CCA game so that it aborts if any ciphertext returned by the $O_{LR}$ oracle
is passed to the decryption oracle. If we do not do this then the game can be easily won.

In the symmetric case if we have access to the $O_{LR}$ oracle for many calls then we do not need
access to the $O_{e_k}$ oracle, since we can simulate one via the other using the fact that $O_{e_k}(m) = O_{LR}(m, m)$;
thus the m-IND-PASS game is equivalent to the m-IND-CPA game³. Returning to our
implications, we also have that

$$\Pi \text{ is m-IND-XXX} \implies \Pi \text{ is IND-XXX},$$

since an adversary which breaks IND-XXX clearly also trivially breaks m-IND-XXX, by just making
a single call to the $O_{LR}$ oracle. Thus we have the following theorem.

³ In the public key case we never need the $O_{e_{pk}}$ oracle in any case.

Theorem 11.6. Let $A$ be a poly-time adversary against the IND-XXX security of the symmetric
encryption scheme $\Pi$, then there is a poly-time adversary $B$ against the m-IND-XXX security of $\Pi$,
with

$$\mathrm{Adv}^{\text{IND-XXX}}_{\Pi}(A) = \mathrm{Adv}^{\text{m-IND-XXX}}_{\Pi}(B).$$

Note that this says that if a scheme $\Pi$ is m-IND-XXX secure, i.e. $\mathrm{Adv}^{\text{m-IND-XXX}}_{\Pi}(B)$ is “small” for
all poly-time $B$, then it is also IND-XXX secure, since $\mathrm{Adv}^{\text{IND-XXX}}_{\Pi}(A)$ will then also be small for all
poly-time $A$ by the above theorem. The natural question to ask is whether the other implication
holds. Well it does, but not as “tightly” as one would like. One can prove the following result, but
it requires a technique called a hybrid argument which is a little too advanced for this book.

Theorem 11.7. Let $A$ be a poly-time adversary against the m-IND-XXX security of the symmetric
encryption scheme $\Pi$, which makes $q_{LR}$ queries to its $O_{LR}$ oracle. Then there is a poly-time
adversary $B$ against the IND-XXX security of $\Pi$, with

$$\mathrm{Adv}^{\text{m-IND-XXX}}_{\Pi}(A) \le q_{LR} \cdot \mathrm{Adv}^{\text{IND-XXX}}_{\Pi}(B).$$

Thus security in the IND-XXX game does not directly translate to security in the m-IND-XXX game,
as it all depends on how many queries to the $O_{LR}$ oracle we allow.
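The observation that a many-time LR oracle can stand in for the encryption oracle amounts to calling it with the same message on both sides. The Python sketch below illustrates this with a toy randomized scheme of our own (a random 16-byte value $r$ followed by the message XORed with a SHA-256-derived pad). It is only meant to exercise the oracle interface and is not a vetted construction.

```python
import hashlib
import secrets

def keygen() -> bytes:
    return secrets.token_bytes(16)

def _pad(k: bytes, r: bytes, n: int) -> bytes:
    """Derive an n-byte pad (n <= 32) from the key and the per-message value r."""
    return hashlib.sha256(k + r).digest()[:n]

def enc(k: bytes, m: bytes):
    """Toy randomized encryption: returns the pair (r, m XOR pad)."""
    if len(m) > 32:
        raise ValueError("toy scheme only handles messages up to 32 bytes")
    r = secrets.token_bytes(16)
    return r, bytes(x ^ y for x, y in zip(m, _pad(k, r, len(m))))

def dec(k: bytes, c) -> bytes:
    r, body = c
    return bytes(x ^ y for x, y in zip(body, _pad(k, r, len(body))))

def make_lr_oracle(k: bytes, b: int):
    """A many-time left-right oracle for the hidden bit b."""
    def lr_oracle(m0: bytes, m1: bytes):
        if len(m0) != len(m1):
            raise ValueError("messages must have equal length")
        return enc(k, (m0, m1)[b])
    return lr_oracle

def simulated_enc_oracle(lr_oracle):
    """Simulate the encryption oracle via O_LR(m, m)."""
    return lambda m: lr_oracle(m, m)

if __name__ == "__main__":
    k = keygen()
    oracle = simulated_enc_oracle(make_lr_oracle(k, secrets.randbelow(2)))
    print(dec(k, oracle(b"attack at dawn")))
```

Whatever the hidden bit is, the simulated oracle returns an honest encryption of $m$, so the adversary loses nothing by being denied a separate encryption oracle.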

11.6.2. Real-or-Random: In the real-or-random security game the $O_{LR}$ oracle is replaced with
an $O_{RoR}$ oracle. This oracle either encrypts the asked for message (when $b = 1$) or it encrypts a
random message of the same length (when $b = 0$). This can either be called once or many times,
just as the $O_{LR}$ oracle could be called once or many times. This leads us to six new notions of
encryption security RoR-XXX and m-RoR-XXX where XXX is one of PASS, CPA or CCA. We present
these diagrammatically in Figure 11.14 for the case of symmetric key RoR-CCA. Note that in this
case if the $O_{RoR}$ oracle can be called many times we cannot use it to replace the $O_{e_k}$ oracle.


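    $k \leftarrow \mathrm{KeyGen}()$
    $b \leftarrow \{0, 1\}$
    Oracle $O_{RoR}$: on input $m$ from $A$, if $b = 0$ then $m' \leftarrow \{0, 1\}^{|m|}$ else $m' \leftarrow m$; return $c^* \leftarrow e_k(m')$
    Oracle $O_{e_k}$: on input $m$ from $A$, return $c \leftarrow e_k(m)$
    Oracle $O_{d_k}$: on input $c$ from $A$, if $c = c^*$ then abort; otherwise return $m \leftarrow d_k(c)$
    $b' \leftarrow A$
    Win if $b' = b$

Figure 11.14. Security game for symmetric key RoR-CCA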

It  turns  out  that  RoR  security  is  closely  related  to  IND  security.  In  particular  we  have  the 
following  in  the  symmetric  case  (the  analogous  result  can  be  shown  to  hold  in  the  public  key  case). 

Theorem 11.8. A symmetric encryption scheme which is IND-CCA secure is also RoR-CCA secure.
In particular, if A is an adversary against RoR-CCA security for a symmetric encryption scheme
$\Pi$, then we can build an adversary B against IND-CCA security of $\Pi$ with

\[ \mathrm{Adv}^{\text{RoR-CCA}}_{\Pi}(A) = \mathrm{Adv}^{\text{IND-CCA}}_{\Pi}(B). \]

Proof.  Again  we  construct  a  proof  by  wiring  the  two  adversaries  together,  see  Figure  11.15.  The 
inner  adversary  (subroutine)  is  A  who  is  able  to  break  the  RoR  security.  From  this  we  construct 
an adversary B which breaks the IND security of the same encryption scheme $\Pi$.

Figure 11.15. Constructing an IND-CCA adversary B from an RoR-CCA adversary
A (symmetric case)


The advantage statement is immediate since we see that the adversary B is perfectly simulating
the environment of A, i.e. A cannot tell whether it is playing its game against its normal challenger
or is playing its game under the control of the algorithm B. In addition the probability that B
wins its game is the same as the probability that A wins its game. For those who find deciphering
the diagram difficult, we now give the adversary B in a code-like form:

• Call adversary A.

• If A makes a call to its $\mathcal{O}_{e_k}$ oracle then forward the query to B's $\mathcal{O}_{e_k}$ oracle, and then
reply to A with the response from B's oracle.

• If A makes a call to its $\mathcal{O}_{d_k}$ oracle then forward the query to B's $\mathcal{O}_{d_k}$ oracle, and then
reply to A with the response from B's oracle.

• When A makes a query to its $\mathcal{O}_{RoR}$ oracle, take the query m and call it $m_1$. Generate
a random message $m_0 \in P$ and forward the pair $(m_0, m_1)$ to B's $\mathcal{O}_{LR}$ oracle. Pass the
response back to A to answer her query.

• When A terminates with a bit b', return this as B's answer.

Notice that the $\mathcal{O}_{LR}$ oracle will return the encryption of $m_1$ when b = 1, i.e. when the game of A says the real message
should be encrypted, and it will return the encryption of $m_0$ (a random message chosen by B) when b = 0. □
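The wiring of A inside B can also be sketched in Python. This is only an illustration under stated assumptions: the class and function names (`ToyScheme`, `IndCcaChallenger`, `RorView`, `adversary_b`, `leaky_ror_adversary`) are hypothetical, and the toy cipher is deliberately insecure so that the sample adversary A wins with overwhelming probability and the behaviour of B is easy to observe.

```python
import secrets


class ToyScheme:
    """Deliberately leaky toy cipher (assumption): random byte || message."""

    @staticmethod
    def enc(key: bytes, m: bytes) -> bytes:
        return secrets.token_bytes(1) + m

    @staticmethod
    def dec(key: bytes, c: bytes) -> bytes:
        return c[1:]


class IndCcaChallenger:
    """IND-CCA challenger offering LR, encryption and decryption oracles."""

    def __init__(self, scheme) -> None:
        self.scheme = scheme
        self.key = secrets.token_bytes(16)
        self.b = secrets.randbelow(2)  # hidden bit
        self.c_star = None

    def enc_oracle(self, m: bytes) -> bytes:
        return self.scheme.enc(self.key, m)

    def dec_oracle(self, c: bytes) -> bytes:
        if c == self.c_star:
            raise ValueError("abort: challenge ciphertext queried")
        return self.scheme.dec(self.key, c)

    def lr_oracle(self, m0: bytes, m1: bytes) -> bytes:
        self.c_star = self.scheme.enc(self.key, m1 if self.b else m0)
        return self.c_star


class RorView:
    """The RoR-CCA interface that B presents to A."""

    def __init__(self, challenger: IndCcaChallenger) -> None:
        # Encryption and decryption queries are forwarded unchanged.
        self.enc_oracle = challenger.enc_oracle
        self.dec_oracle = challenger.dec_oracle
        self._lr_oracle = challenger.lr_oracle

    def ror_oracle(self, m: bytes) -> bytes:
        m1 = m                              # the real message
        m0 = secrets.token_bytes(len(m))    # random message chosen by B
        return self._lr_oracle(m0, m1)


def adversary_b(challenger: IndCcaChallenger, adversary_a) -> int:
    """B runs A on the simulated interface and returns A's bit as its own."""
    return adversary_a(RorView(challenger))


def leaky_ror_adversary(view: RorView) -> int:
    """Sample A: reads the plaintext straight out of the toy ciphertext."""
    m = bytes(16)
    c_star = view.ror_oracle(m)
    return 1 if c_star[1:] == m else 0
```

Against this toy cipher B guesses the hidden bit whenever A does, which mirrors the equality of advantages in the theorem.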

The  obvious  question  which  then  comes  to  mind  is  whether  the  implication  holds  the  other  way 
around,  i.e.  is  an  encryption  scheme  which  is  RoR-CCA  secure  also  IND-CCA  secure.  The  answer 
is  perhaps  unsurprisingly  yes,  but  what  is  surprising  on  first  sight  is  the  fact  that  we  “lose”  some 
security  as  the  next  result  shows. 

Theorem 11.9. A symmetric encryption scheme which is RoR-CCA secure is also IND-CCA secure.
In particular, if A is an adversary against IND-CCA security for a symmetric encryption scheme
$\Pi$, then we can build an adversary B against RoR-CCA security of $\Pi$ with

\[ \mathrm{Adv}^{\text{IND-CCA}}_{\Pi}(A) = 2 \cdot \mathrm{Adv}^{\text{RoR-CCA}}_{\Pi}(B). \]

Proof.  The  argument  is  very  similar  to  the  previous  case.  The  reader  is  invited  to  draw  a  picture 
like  that  used  above.  For  sake  of  space  we  shall  only  give  the  code-based  description  of  B. 

•  Call  adversary  A. 

• If A makes a call to its $\mathcal{O}_{e_k}$ oracle then forward the query to B's $\mathcal{O}_{e_k}$ oracle, and then
reply to A with the response from B's oracle.

• If A makes a call to its $\mathcal{O}_{d_k}$ oracle then forward the query to B's $\mathcal{O}_{d_k}$ oracle, and then
reply to A with the response from B's oracle.

• When A makes a query to its $\mathcal{O}_{LR}$ oracle, take the query $(m_0, m_1)$, pick a random bit
$t \in \{0,1\}$, and pass $m_t$ to B's $\mathcal{O}_{RoR}$ oracle. Pass the response back to A to answer its
query.

• When A terminates with a bit b', adversary B returns one if b' = t and zero otherwise.

Let us analyse the advantage of B. When the hidden bit b = 0, the $\mathcal{O}_{RoR}$ oracle of B will return the
encryption of a random message. When we return this to A it has no information about whether $m_0$ or $m_1$ was
encrypted, because neither was. Thus all that A can do is return a random response b', which will
result in B being correct fifty percent of the time.

When the hidden bit b = 1 the $\mathcal{O}_{RoR}$ oracle of B will return the valid encryption of $m_t$. Thus A
will return b', its best guess as to whether t = 0 or t = 1. So when b = 1 we are actually simulating
the game for adversary A perfectly.

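Picking these two cases apart we obtain

\begin{align*}
\mathrm{Adv}^{\text{RoR-CCA}}_{\Pi}(B) &= 2 \cdot \left| \Pr[B \text{ wins}] - \frac{1}{2} \right| \\
&= 2 \cdot \left| \Pr[B \text{ wins} \mid b = 1] \cdot \Pr[b = 1] + \Pr[B \text{ wins} \mid b = 0] \cdot \Pr[b = 0] - \frac{1}{2} \right| \\
&= 2 \cdot \left| \Pr[A \text{ wins}] \cdot \Pr[b = 1] + \Pr[b' \ne t] \cdot \Pr[b = 0] - \frac{1}{2} \right| \\
&= 2 \cdot \left| \Pr[A \text{ wins}] \cdot \frac{1}{2} + \frac{1}{2} \cdot \frac{1}{2} - \frac{1}{2} \right| \\
&= \left| \Pr[A \text{ wins}] - \frac{1}{2} \right| \\
&= \frac{1}{2} \cdot \mathrm{Adv}^{\text{IND-CCA}}_{\Pi}(A).
\end{align*}

□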


11.6.3.  Lunchtime  Attacks:  Sometimes,  in  older  literature,  the  above  notion  of  a  CCA  attack  is 
called  an  adaptive  chosen  ciphertext  attack,  since  the  chosen  ciphertexts  passed  to  the  decryption 
oracle  can  depend  on  the  challenge  ciphertext  (but  cannot  be  equal  to  it  of  course).  This  is  to 
distinguish  it  from  an  earlier  notion  called  a  lunchtime  attack,  which  is  sometimes  denoted  CCA1. 
In  this  weaker  form  of  attack  the  adversary  only  gets  access  to  the  decryption  oracle  during  the 
“find” stage of the algorithm, i.e. before the $\mathcal{O}_{LR}$ oracle is called. The lunchtime attack models the
situation in which the adversary has access to a black box which performs decryptions, but only
at lunchtime whilst the real user of the box is “away at lunch”. After this, at some point “in the
afternoon”, she is given a challenge ciphertext and asked to decrypt it or to learn something about
the  underlying  plaintext,  on  her  own,  without  using  the  box. 

11.6.4.  Nonce-Based  Encryption:  It  should  be  clear  that  with  the  above  notions,  with  the 
exception  of  IND-PASS  in  the  symmetric  case,  an  encryption  algorithm  must  be  randomized.  In 
other words the encryption algorithm, $e_k(m)$ or $e_{pk}(m)$, must be a randomized algorithm. To see
why this is the case consider the following “attack” on a deterministic symmetric scheme.

• Generate distinct $m_0$ and $m_1$ of the same length.

• Obtain $c^* \leftarrow e_k(m_b)$ via a call to $\mathcal{O}_{LR}$.

• Obtain $c \leftarrow e_k(m_0)$ via a call to $\mathcal{O}_{e_k}$.

• If $c = c^*$ then output zero, otherwise output one.

This attack shows us that a deterministic encryption algorithm cannot be IND-CPA secure, since
the above adversary wins with probability one.
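The four steps of this attack can be sketched in Python as follows. The cipher `det_encrypt` is a hypothetical deterministic toy (a SHA-256 based keystream with no randomness or nonce), chosen only so that the attack has something to run against; the class `IndCpaGame` and the helper names are likewise assumptions of this sketch.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key || counter (illustrative only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def det_encrypt(key: bytes, message: bytes) -> bytes:
    """Deterministic toy cipher: equal inputs always give equal outputs."""
    return bytes(a ^ b for a, b in zip(message, keystream(key, len(message))))


class IndCpaGame:
    """IND-CPA challenger with an LR oracle and an encryption oracle."""

    def __init__(self) -> None:
        self.key = secrets.token_bytes(16)
        self.b = secrets.randbelow(2)  # hidden bit

    def lr_oracle(self, m0: bytes, m1: bytes) -> bytes:
        assert len(m0) == len(m1)
        return det_encrypt(self.key, m1 if self.b else m0)

    def enc_oracle(self, m: bytes) -> bytes:
        return det_encrypt(self.key, m)


def adversary(game: IndCpaGame) -> int:
    m0, m1 = b"attack at dawn!!", b"retreat at dusk!"  # distinct, same length
    c_star = game.lr_oracle(m0, m1)   # challenge ciphertext
    c = game.enc_oracle(m0)           # re-encrypt m0 under the same key
    return 0 if c == c_star else 1


def run_trials(trials: int = 200) -> int:
    """Count how many independent games the adversary wins."""
    wins = 0
    for _ in range(trials):
        game = IndCpaGame()
        wins += adversary(game) == game.b
    return wins
```

Because the toy cipher is deterministic, the adversary's guess matches the hidden bit in every trial.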

In  practice  this  is  a  bit  awkward  for  some  encryption  algorithms,  so  we  alter  the  underlying 
Application  Programming  Interface  (API)  slightly  and  enable  the  caller  of  the  encryption  algorithm 
to  supply  the  “randomness”,  as  opposed  to  the  randomness  coming  from  inside  the  encryption 
algorithm.  The  only  requirement  is  that  the  randomness  should  be  of  one-time  use,  and  hence 
should be a number used once, or a nonce for short. Thus in nonce-based encryption the underlying
encryption  scheme  is  assumed  to  be  deterministic,  with  the  “randomness”  coming  from  the  nonce 
only.  This  notion  is  mainly  used  for  symmetric  encryption  algorithms,  and  a  modification  of  the 
IND-CCA  security  game  for  nonce-based  encryption  is  given  in  Figure  11.16. 


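$k \leftarrow \mathrm{KeyGen}()$, $b \leftarrow \{0,1\}$, $\mathcal{N} \leftarrow \emptyset$
The adversary A has access to the three oracles below and outputs a bit b'.
Win if b' = b.

$\mathcal{O}_{LR}$, on input $m_0, m_1 \in P$ and a nonce n:
    If $n \in \mathcal{N}$ then abort.
    $\mathcal{N} \leftarrow \mathcal{N} \cup \{n\}$
    $c^* \leftarrow e_k(m_b; n)$

$\mathcal{O}_{e_k}$, on input $m \in P$ and a nonce n:
    If $n \in \mathcal{N}$ then abort.
    $\mathcal{N} \leftarrow \mathcal{N} \cup \{n\}$
    $c \leftarrow e_k(m; n)$

$\mathcal{O}_{d_k}$, on input $c \in C$:
    If $c = c^*$ then abort,
    $m \leftarrow d_k(c)$

Figure 11.16. Security game for nonce-based symmetric key IND-CCA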
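The nonce bookkeeping of the game in Figure 11.16 can be sketched in a few lines of Python. The function `nonce_encrypt` is a hypothetical toy cipher (XOR with a SHAKE-256 pad derived from the key and nonce), not a scheme from the text, and the class name is an assumption; the point is only that encryption is deterministic given the nonce and that the oracle aborts on a repeated nonce.

```python
import hashlib
import secrets


def nonce_encrypt(key: bytes, message: bytes, nonce: bytes) -> bytes:
    """Toy deterministic e_k(m; n): XOR with a pad derived from key and nonce."""
    pad = hashlib.shake_256(key + nonce).digest(len(message))
    return bytes(a ^ b for a, b in zip(message, pad))


class NonceRespectingOracle:
    """Encryption oracle that refuses to reuse a nonce, as in the game."""

    def __init__(self) -> None:
        self.key = secrets.token_bytes(16)
        self.used_nonces = set()  # the set of nonces seen so far

    def encrypt(self, message: bytes, nonce: bytes) -> bytes:
        if nonce in self.used_nonces:
            raise ValueError("abort: nonce already used")
        self.used_nonces.add(nonce)
        return nonce_encrypt(self.key, message, nonce)
```

Encrypting the same message under two different nonces gives unrelated ciphertexts, whereas a second call with an old nonce is rejected.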


For  the  equivalent  OW  security  notion  we  let  the  adversary  have  access  to  an  oracle  which  will 
provide  a  single  challenge  encryption,  using  a  nonce  of  the  adversary’s  choice,  but  for  a  message 
uniformly  picked  from  all  possible  messages  of  a  defined  length. 

11.6.5. Data Encapsulation Mechanisms: In real-world systems one often uses a fixed symmetric
key only once, to encrypt a single message. In this case we do not have a symmetric
encryption scheme, but something called a data encapsulation mechanism, or DEM for short. Since
the attacker against a DEM is such that he can only ever get one ciphertext from the legitimate
user we restrict the IND-game for such adversaries to a single call to the $\mathcal{O}_{LR}$ oracle and no calls to
the $\mathcal{O}_{e_k}$ oracle. In the absence of a decryption oracle this is precisely what we have called IND-PASS
earlier.

It is sometimes reasonable in such situations to allow the adversary many calls to the $\mathcal{O}_{d_k}$
oracle, since whilst the adversary may only have one ciphertext she could trick a decryptor into
trying to decrypt multiple variants of the ciphertext. Such an adversary is called an ot-IND-CCA
adversary against the DEM, to signal that the encryption scheme is only being used once.
The security model is given in Figure 11.17.

Unlike  the  case  of  other  encryption  schemes,  as  only  one  ciphertext  can  ever  be  obtained,  there 
is  no  need  for  the  encryption  algorithm  to  be  randomized.  Thus  a  DEM  is  one  of  the  few  times 
that  one  can  use  deterministic  encryption. 

11.6.6. Non-malleability: An important concept related to an encryption scheme is that of
malleability. An encryption scheme is said to be non-malleable if given a ciphertext c* corresponding
to an unknown plaintext m*, it is impossible to determine a valid ciphertext c on a ‘related’ message
m. Note that ‘related’ is defined vaguely here on purpose, but it is assumed that the adversary
knows the relation.

$k \leftarrow \mathrm{KeyGen}()$, $b \leftarrow \{0,1\}$
The adversary A has access to the two oracles below and outputs a bit b'.
Win if b' = b.

$\mathcal{O}_{LR}$, on input $m_0, m_1 \in P$ (a single call):
    $c^* \leftarrow e_k(m_b)$

$\mathcal{O}_{d_k}$, on input $c \in C$:
    If $c = c^*$ then abort,
    $m \leftarrow d_k(c)$

Figure 11.17. Security game ot-IND-CCA for a DEM


Thus, in the symmetric key setting a malleability attacker M is one who takes in a ciphertext
c* and who outputs a pair {c, f} such that

\[ d_k(c) = f(d_k(c^*)). \]

In the public key setting the definition is the same except that the equation becomes

\[ d_{sk}(c) = f(d_{sk}(c^*)). \]

In both cases we assume the function f is a non-trivial bijection whose inverse is easy to compute.
Non-malleability  is  important  due  to  the  following  result,  for  which  we  only  give  an  informal  proof 
based  on  our  vague  definition  of  non-malleability.  A  formal  proof  can  however  be  given,  with  an 
associated  formal  definition  of  non-malleability. 

Theorem  11.10.  A  malleable  encryption  scheme  is  not  OW-CCA,  and  is  hence  not  IND-CCA 
either. 

Proof.  We  give  the  proof  simultaneously  in  the  symmetric  and  public  key  settings.  Suppose  that 
a  scheme  is  malleable.  Our  goal  is  to  construct  an  adversary  A  against  the  OW-CCA  security  of 
the  encryption  scheme  using  the  malleability  adversary  M  as  a  subroutine.  Our  adversary  A  takes 
the  challenge  ciphertext  c*  and  passes  it  to  the  algorithm  M  which  breaks  malleability.  We  obtain 
{c, f} ← M(c*). The adversary then passes the ciphertext c to its decryption oracle. Since c ≠ c*
the decryption oracle will return the decryption m of c. We now apply $f^{-1}$ to m to obtain m*, thus
breaking the OW security of the encryption scheme. □
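The argument in this proof can be traced on a concrete malleable scheme. The sketch below assumes a toy one-time-pad style cipher (XOR with the key), which is not a scheme discussed in the text; it is used here only because flipping a ciphertext bit flips the same plaintext bit, so a malleability attacker M is trivial to write. All names are hypothetical.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


class OwCcaGame:
    """OW-CCA challenger for a toy XOR cipher (an assumption of this sketch)."""

    def __init__(self, length: int = 16) -> None:
        self.key = secrets.token_bytes(length)
        self.m_star = secrets.token_bytes(length)       # unknown plaintext
        self.c_star = xor_bytes(self.m_star, self.key)  # challenge ciphertext

    def dec_oracle(self, c: bytes) -> bytes:
        if c == self.c_star:
            raise ValueError("abort: challenge ciphertext queried")
        return xor_bytes(c, self.key)


def malleability_attacker(c_star: bytes):
    """M: outputs (c, f) with d_k(c) = f(d_k(c*)), where f(m) = m XOR delta."""
    delta = b"\x01" + bytes(len(c_star) - 1)
    c = xor_bytes(c_star, delta)

    def f(m: bytes) -> bytes:
        return xor_bytes(m, delta)

    return c, f


def ow_cca_adversary(game: OwCcaGame) -> bytes:
    c, f = malleability_attacker(game.c_star)
    m = game.dec_oracle(c)  # allowed, since c != c*
    # For this particular f, the inverse is f itself (XOR is an involution).
    return f(m)
```

The adversary recovers the challenge plaintext with a single decryption query, exactly as in the proof.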


11.6.7.  Plaintext  Aware:  If  a  scheme  is  plaintext  aware  then  we  have  a  very  strong  notion  of 
security  against  chosen  ciphertext  attacks.  A  scheme  is  called  plaintext  aware  if  it  is  computationally 
difficult  to  construct  a  valid  ciphertext  without  being  given  the  corresponding  plaintext  to  begin 
with.  Hence,  plaintext  awareness  implies  that  one  cannot  mount  a  CCA  attack,  since  to  write  down 
a  ciphertext  requires  you  to  first  know  the  plaintext,  thus  making  access  to  the  decryption  oracle 
redundant.  Thus  if  a  scheme  is  both  IND-CPA  and  plaintext  aware  then  it  will  also  be  IND-CCA. 
We  do  not  make  much  use  of  this  notion  in  this  book,  so  we  do  not  go  into  a  formal  game  to  explain 
its  definition. 

11.6.8. Relations Between Security Notions: We have presented a number of different definitions
in the above sections for encryption. They are all related as Figure 11.18 presents for some
of the cases of symmetric encryption. The values on each arrow relate to the “loss” in security as
we move from one security definition to another. As can be seen IND-CCA is a notion which implies
all of the others, so it is this notion which we use as the “gold standard” for defining security.

[Diagram: only the bottom row survives in this copy, OW-CCA ⇒ OW-CPA ⇒ OW-PASS, with each arrow labelled 1.]

Figure 11.18. Relations among notions for symmetric key encryption


11.7.  Authentication:  Security  of  Signatures  and  MACs 

Cryptography  is  more  than  the  production  of  encryption  schemes;  a  major  concern  in  the  subject 
is that of authentication. How do you know some data is correct (data authentication), and how
do you know an entity is who they claim to be (entity authentication)? Just as with encryption we
can  define  symmetric  and  public  key  algorithms  to  obtain  authentication.  In  the  symmetric  setting 
these  are  called  message  authentication  codes  (or  MACs)  and  in  the  public  key  setting  these  are 
called  digital  signatures.  We  first  define  these  two  notions,  and  then  go  on  to  discuss  the  various 
security  notions. 

11.7.1.  Message  Authentication  Codes:  Suppose  two  parties,  who  share  a  secret  key,  wish  to 
ensure  that  data  transmitted  between  them  has  not  been  tampered  with.  They  can  then  use  the 
shared  secret  key  and  a  keyed  algorithm  to  produce  a  tag,  or  MAC,  which  is  sent  with  the  data. 
In  symbols  we  compute 

\[ \mathrm{tag} = \mathrm{Mac}_k(m) \]

where

• Mac is the tag-producing function,

• k is the secret key,

• m is the message.

Note we do not assume that the message is secret: we are trying to protect data integrity and not
confidentiality. Producing a tag on its own is not enough; we need a way to verify that it was
produced by the correct key.

Hence, we define a MAC scheme as a pair of keyed public algorithms $\mathrm{Mac}_k$ and $\mathrm{Verify}_k$ such
that for a given message m on application of the function $\mathrm{Mac}_k$ to m we obtain a value t

\[ t \leftarrow \mathrm{Mac}_k(m). \]

The tag is such that when we pass it, and the original message, to the verification algorithm $\mathrm{Verify}_k$
then we will get a bit v signalling a valid response.

\[ \mathrm{Verify}_k(t, m) = \text{valid}. \]

The idea is that it should be hard for the adversary to get such a valid response, unless the tag has
been obtained from the MAC algorithm $\mathrm{Mac}_k$ using the same key k.
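As a concrete instance of the pair (Mac, Verify), the following sketch uses HMAC-SHA256 from the Python standard library. HMAC is a choice made for this illustration and is not specified by the text at this point; the function names `mac` and `verify` are hypothetical wrappers.

```python
import hashlib
import hmac
import secrets


def mac(key: bytes, message: bytes) -> bytes:
    """t <- Mac_k(m), instantiated here with HMAC-SHA256."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, tag: bytes, message: bytes) -> bool:
    """Verify_k(t, m): recompute the tag and compare in constant time."""
    return hmac.compare_digest(tag, mac(key, message))


key = secrets.token_bytes(32)
tag = mac(key, b"transfer 10 pounds to Bob")
```

With this instantiation verification simply recomputes the tag, so a tag made under a different key or for a different message fails to verify.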

11.7.2. Digital Signature Schemes: Signatures are an important concept of public key cryptography;
they also were invented by Diffie and Hellman in the same 1976 paper that invented
public key encryption, but the first practical system was due to Rivest, Shamir and Adleman. The
basic  idea  behind  public  key  signatures  is  as  follows: 

Message  +  Alice’s  private  key  =  Signature, 

Message  +  Signature  +  Alice’s  public  key  =  YES/NO. 

The  above  is  called  a  digital  signature  scheme  with  appendix.  Since  the  signature  is  appended 
to  the  message  before  transmission,  the  message  needs  to  be  input  into  the  signature  verification 
procedure.  Another  variant  is  the  signature  scheme  with  message  recovery,  where  the  message  is 
output  by  the  signature  verification  procedure,  as  described  in 

Message  +  Alice’s  private  key  =  Signature, 

Signature  +  Alice’s  public  key  =  YES/NO  +  Message. 

The  main  idea  is  that  only  Alice  can  sign  a  message,  which  could  only  come  from  her  since  only 
Alice  has  access  to  the  private  key.  On  the  other  hand  anyone  can  verify  Alice’s  digital  signature, 
since  everyone  can  have  access  to  her  public  key.  A  digital  signature  scheme  consists  more  formally 
of  two  algorithms: 

• A signing algorithm S which uses a secret key sk,

• A verification algorithm V which uses a public key pk.

In the following discussion, we assume a digital signature with appendix. For a variant with message
recovery a simple change to the following will suffice. Alice, sending a message m, calculates

\[ s \leftarrow \mathrm{Sig}_{sk}(m) \]

and then transmits m, s, where s is the digital signature on the message m. Note that we are not
interested in keeping the message secret here, since we are only interested in knowing who it comes
from.

The receiver of the signature s applies the verification transform $\mathrm{Verify}_{pk}$ to s (together with m). The output is
a bit v. The bit v indicates valid or invalid, i.e. whether the digital signature is good or not. For
correctness we require that

\[ \mathrm{Verify}_{pk}(\mathrm{Sig}_{sk}(m), m) = \text{valid}, \]

if pk is the public key associated with sk.

11.7.3. Security Definitions for MACs and Digital Signatures: As can be seen from the
above definitions there is a great deal of similarity between MACs and digital signatures. We will
define their security together, just as we did for symmetric and public key encryption, which will
help bring out the similarity in greater detail.

First  let  us  see  intuitively  what  the  recipient,  Bob,  wishes  to  obtain  from  the  bit  v  returned  by 
either  the  MAC  or  signature  verification  algorithm  when  applied  to  a  (message  and)  tag/signature 
sent from Alice. There are two main properties:

• message integrity - the message has not been altered in transit,

• message origin - the message was sent by Alice.

For  digital  signatures,  since  verification  is  a  public  operation  we  can  define  a  third  property 

•  non-repudiation  -  Alice  cannot  claim  she  did  not  send  the  message. 

To  see  why  non-repudiation  is  so  important,  consider  what  would  happen  if  you  could  sign  a  cheque 
and  then  say  you  did  not  sign  it. 

The  adversary  against  a  MAC  or  signature  algorithm  is  called  a  forger.  Just  like  we  had  OW, 
IND,  RoR  etc.  notions  for  encryption  algorithms,  there  are  many  notions  of  security  for  MAC  and 
signature  algorithms.  The  three  main  types  of  forgery  are: 

Total Break: The forger can produce MACs/signatures just as if he were the valid key holder.
This is akin to recovering the secret/private key and corresponds to the similar type of break of an
encryption algorithm.

Selective  Forgery:  In  this  case  the  adversary  is  able  to  forge  a  MAC/signature  on  a  single 
message  of  the  challenger’s  choosing.  This  is  similar  to  the  ability  of  an  adversary  of  an  encryption 
algorithm  being  able  to  decrypt  a  message  but  not  recover  the  private  key,  i.e.  like  OW  security. 

Existential  Forgery:  In  this  case  the  adversary  is  able  to  forge  a  MAC/signature  on  a  single 
message  of  the  adversary’s  choosing.  This  message  could  just  be  a  random  bit  string,  it  does  not 
need  to  mean  anything.  It  can  be  considered  analogous  to  IND  security  of  encryption  schemes. 

In  practice  we  usually  want  our  schemes  to  be  secure  against  an  attempt  to  produce  a  selective 
forgery.  But  we  do  not  know  how  the  MAC/signature  scheme  is  to  be  used  in  real  life:  for  example 
it may be used in a challenge/response protocol where random bit strings are MAC'ed/signed by
various  parties.  Hence,  it  is  prudent  to  insist  that  any  MAC/signature  scheme  should  be  secure 
against  an  existential  forgery. 

Along  with  types  of  forgery  we  also  have  types  of  attack.  The  weakest  attack  is  that  of  a  passive 
attacker,  who  is  simply  given  the  public  key  (in  the  case  of  a  signature  algorithm),  or  a  verification 
oracle  (in  the  case  of  a  MAC  algorithm)  and  is  then  asked  to  produce  a  forgery,  be  it  selective  or 
existential. 


$k \leftarrow \mathrm{KeyGen}()$
The adversary A has access to the two oracles below and outputs a pair $(m^*, t^*)$.
Win if $\mathrm{Verify}_k(t^*, m^*) = \text{valid}$ and $m^* \notin C$, where C is the list of messages queried to $\mathcal{O}_{\mathrm{Mac}_k}$.

$\mathcal{O}_{\mathrm{Mac}_k}$, on input $m \in P$:
    $C \leftarrow C \cup \{m\}$
    $t \leftarrow \mathrm{Mac}_k(m)$

$\mathcal{O}_{\mathrm{Verify}_k}$, on input $(t, m) \in T \times P$:
    $v \leftarrow \mathrm{Verify}_k(t, m)$

Figure 11.19. Security game for MAC security EUF-CMA

The strongest form of attack is that by an adaptive active attacker, who is given access to a
MAC/signing oracle which will produce valid MACs/signatures, as well as the data that a passive
attacker has access to. The goal of an active attacker is to produce a MAC/signature on a message
which she has not yet queried of her MAC/signing oracle. This leads to the following definition:

Definition 11.11. A MAC/signature scheme is deemed to be secure if it is infeasible for an adaptive
adversary to produce an existential forgery. We call this notion EUF-CMA, for Existentially
UnForgeable against a Chosen Message Attack.

We illustrate these notions in Figures 11.19 and 11.20, where we let P be the message space, K the
private/secret key space and T/S be the tag/signature space.


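$(pk, sk) \leftarrow \mathrm{KeyGen}()$, $C \leftarrow \emptyset$
The adversary A is given pk, has access to the oracle below and outputs a pair $(m^*, s^*)$.
Win if $\mathrm{Verify}_{pk}(s^*, m^*) = \text{valid}$ and $m^* \notin C$.

$\mathcal{O}_{\mathrm{Sig}_{sk}}$, on input $m \in P$:
    $C \leftarrow C \cup \{m\}$
    $s \leftarrow \mathrm{Sig}_{sk}(m)$

Figure 11.20. Security game for signature security EUF-CMA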


In terms of advantage statements we define the advantage of an adversary winning the EUF-CMA
games (involving either a MAC or signature algorithm $\Pi$) as

\[ \mathrm{Adv}^{\text{EUF-CMA}}_{\Pi}(A) = \Pr[A \text{ wins the EUF-CMA game}]. \]

So a MAC/signature scheme $\Pi$ is secure if for all adversaries the associated advantage is
“small”.
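The EUF-CMA game for MACs can be sketched as a small Python harness. Everything here is illustrative: the class `EufCmaGame` and the two sample adversaries are hypothetical, HMAC-SHA256 stands in for a reasonable MAC, and `keyless_hash` is a deliberately broken "MAC" that ignores its key, included only to show what a winning forgery looks like.

```python
import hashlib
import hmac
import secrets


class EufCmaGame:
    """EUF-CMA challenger for a MAC given as a function mac_fn(key, message)."""

    def __init__(self, mac_fn) -> None:
        self.mac_fn = mac_fn
        self.key = secrets.token_bytes(32)
        self.queried = set()  # the list C of messages sent to the MAC oracle

    def mac_oracle(self, m: bytes) -> bytes:
        self.queried.add(m)
        return self.mac_fn(self.key, m)

    def verify_oracle(self, t: bytes, m: bytes) -> bool:
        return hmac.compare_digest(t, self.mac_fn(self.key, m))

    def wins(self, m_star: bytes, t_star: bytes) -> bool:
        """Winning condition: valid tag on a message never queried."""
        return self.verify_oracle(t_star, m_star) and m_star not in self.queried


def hmac_sha256(key: bytes, m: bytes) -> bytes:
    return hmac.new(key, m, hashlib.sha256).digest()


def keyless_hash(key: bytes, m: bytes) -> bytes:
    """Broken 'MAC' for illustration: the key is ignored."""
    return hashlib.sha256(m).digest()


def replay_adversary(game: EufCmaGame):
    """Queries a message and outputs the same pair; never a forgery."""
    m = b"pay 10 pounds"
    return m, game.mac_oracle(m)


def hash_adversary(game: EufCmaGame):
    """Guesses that the tag is just SHA-256 of the message."""
    m = b"pay 1000 pounds"
    return m, hashlib.sha256(m).digest()
```

Replaying a queried pair does not count as a win, while the guess of `hash_adversary` is a valid forgery only against the keyless construction.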

As  well  as  the  above  notion  of  forgery  there  is  another  notion  which  is  often  used  in  cryptography 
called  strong  existential  unforgeability  against  a  chosen  message  attack,  or  sEUF-CMA.  This  notion 
applies  to  both  signature  and  MAC  schemes,  and  in  such  a  security  game  the  winning  condition  is 
changed  from  A  winning  if  she  outputs  a  MAC/signature  on  a  new  message,  to  one  where  she  wins 
if she outputs a new message/signature pair. We illustrate this for MAC schemes in Figure 11.21.

$k \leftarrow \mathrm{KeyGen}()$
The adversary A has access to the two oracles below and outputs a pair $(m^*, t^*)$.
Win if $\mathrm{Verify}_k(t^*, m^*) = \text{valid}$ and $(t^*, m^*) \notin C$.

$\mathcal{O}_{\mathrm{Mac}_k}$, on input $m \in P$:
    $t \leftarrow \mathrm{Mac}_k(m)$
    $C \leftarrow C \cup \{(t, m)\}$

$\mathcal{O}_{\mathrm{Verify}_k}$, on input $(t, m) \in T \times P$:
    $v \leftarrow \mathrm{Verify}_k(t, m)$

Figure 11.21. Security game for MAC security sEUF-CMA


We  end  this  section  by  noting  the  following  simplification  of  the  MAC  and  digital  signature 
security  games.  In  many  situations  a  key  is  only  used  once  by  the  legitimate  party.  In  the  signature 
setting  these  are  called  one-time  signatures,  however  we  will  also  see  a  use  for  one-time  MACs  when 
we  consider  the  construction  of  data  encapsulation  mechanisms  later.  In  such  a  situation  we  can 
limit  the  adversary  to  only  making  one  query  of  her  MAC/signing  oracle,  and  we  call  the  resulting 
security notion ot-EUF-CMA.

11.8. Bit Security

Earlier  we  looked  at  decision  problems,  i.e.  those  problems  which  output  a  single  bit.  An  interesting 
question  is  whether  computing  a  single  bit  of  the  input  to  a  one-way  function  is  as  hard  as  computing 
the  entire  input.  For  example  suppose  one  is  using  the  RSA  function 

\[ f_{N,e} : x \longmapsto y = x^e \pmod N. \]

It  may  be  that  in  a  certain  system  the  attacker  only  cares  about  computing  b  =  x  (mod  2)  and 
not  the  whole  of  x.  We  would  like  it  to  be  true  that  computing  even  this  single  bit  of  information 
about  x  is  as  hard  as  computing  all  of  x.  In  other  words  we  wish  to  study  the  so-called  bit  security 
of  the  RSA  function.  We  can  immediately  see  that  bit  security  is  related  to  semantic  security. 
For  example  if  an  attacker  could  determine  the  parity  of  an  underlying  plaintext  given  only  the 
ciphertext  she  could  easily  break  the  semantic  security  of  the  encryption  algorithm. 

To  formalize  this  we  first  define  some  notation. 

Definition 11.12. Let $f : S \longrightarrow T$ be a one-way function where S and T are finite sets and let
$B : S \longrightarrow \{0,1\}$ denote a binary function (called a predicate). A hard predicate B(x) for f is one
which is easy to compute given $x \in S$ and for which it is hard to compute B(x) given only $f(x) \in T$.

The way one proves a predicate is a hard predicate, assuming f is a one-way function, is to assume
we are given an oracle which computes B(x) given f(x), and then show that this oracle can be
used to easily invert f.

A k-bit predicate and hard k-bit predicate are defined in an analogous way but now assuming
the codomain of B is the set of bit strings of length k rather than just single bits. We would like
to show that various predicates, for given cryptographically useful functions f, are in fact hard
predicates.


11.8.1.  Hard  Predicates  for  Discrete  Logarithms:  Let  G  denote  a  finite  abelian  group  of 
prime  order  q  and  let  g  be  a  generator.  Consider  the  predicate 


\[ B_2 : x \longmapsto x \pmod 2. \]

We can show the following.

Theorem 11.13. The predicate $B_2$ is a hard predicate for the function $x \longmapsto g^x$.

Proof. Let $\mathcal{O}(h, g)$ denote an oracle which returns the least significant bit of the discrete logarithm
of h to the base g, i.e. it computes $B_2(x)$ for $x = \log_g h$. We need to show how to use $\mathcal{O}$ to solve
a discrete logarithm problem. Suppose we are given $h = g^x$; we perform the following steps. First
we let $t = 1/2 \pmod q$, then we set y = 0, z = 1 and compute until h = 1 the following steps:

• $b = \mathcal{O}(h, g)$.

• If b = 1 then y = y + z and h = h/g.

• Set $h = h^t$ and z = 2 · z.

We then output y as the discrete logarithm of h with respect to g. □

To see the algorithm in the proof work in practice consider the field $\mathbb{F}_{607}$ and the element g = 64
of order q = 101. We wish to find the discrete logarithm of h = 56 with respect to g. Using the
algorithm in the above proof we compute the following table, and hence deduce that x equals 86.
One can indeed then check that $g^{86} = h \pmod p$.

    h      O(h,g)    z     y
    56     0         1     0
    451    1         2     2
    201    1         4     6
    288    0         8     6
    100    1         16    22
    454    0         32    22
    64     1         64    86
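The algorithm from the proof can be run on this example with a short Python sketch. The oracle `lsb_oracle` is a stand-in that finds the discrete logarithm by brute force and returns its least significant bit; a real oracle is only hypothesised in the proof, so this simulation and the function names are assumptions of the sketch.

```python
P = 607   # the field F_607 of the example
G = 64    # element g of order Q
Q = 101   # prime order q


def lsb_oracle(h: int) -> int:
    """Stand-in for O(h, g): least significant bit of log_g(h), by brute force."""
    x, acc = 0, 1
    while acc != h:
        acc = acc * G % P
        x += 1
        if x >= Q:
            raise ValueError("h is not in the subgroup generated by g")
    return x & 1


def discrete_log(h: int) -> int:
    """Recover log_g(h) bit by bit using only the LSB oracle."""
    t = pow(2, -1, Q)       # t = 1/2 (mod q)
    g_inv = pow(G, -1, P)   # used to compute h / g
    y, z = 0, 1
    while h != 1:
        b = lsb_oracle(h)
        if b == 1:
            y += z
            h = h * g_inv % P   # clear the lowest bit of the exponent
        h = pow(h, t, P)        # halve the (now even) exponent
        z *= 2
    return y
```

Calling `discrete_log(56)` follows the rows of the table above and returns 86.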

11.8.2. Hard Predicates for the RSA Problem: The RSA problem, namely given $c = m^e \pmod N$
find m, has the following three hard predicates:

• $B_1(m) = m \pmod 2$.

• $B_h(m) = 0$ if $m < N/2$, otherwise $B_h(m) = 1$.

• $B_k(m) = m \pmod{2^k}$ where $k = O(\log(\log N))$.

We denote the corresponding oracles by $\mathcal{O}_1(c, N)$, $\mathcal{O}_h(c, N)$ and $\mathcal{O}_k(c, N)$. We do not deal with the
last of these but we note that the first two are related since

\[ \mathcal{O}_h(c, N) = \mathcal{O}_1(c \cdot 2^e \pmod N, N), \]
\[ \mathcal{O}_1(c, N) = \mathcal{O}_h(c \cdot 2^{-e} \pmod N, N). \]

We then have, given an oracle for $\mathcal{O}_h$ or $\mathcal{O}_1$, that we can invert the RSA function using the following
algorithm, which is based on the standard binary search algorithm. We let y = c, l = 0 and h = N,
then, while h − l > 1, we perform

• $b = \mathcal{O}_h(y, N)$,

• $y = y \cdot 2^e \pmod N$,

• $m = (h + l)/2$,

• If b = 1 then set l = m, otherwise set h = m.

On exiting the above loop the value of $\lfloor h \rfloor$ should be the preimage of c under the RSA function.
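The binary search can be sketched in Python using the parameters of the example in the text (N = 10403, e = 7, c = 3). The oracle `half_oracle` is simulated with the private exponent, obtained from the factorisation 10403 = 101 · 103, which the attacker would not have; this simulation, the use of exact fractions for the interval end points, and the function names are assumptions of the sketch.

```python
from fractions import Fraction
from math import ceil

N = 10403                    # = 101 * 103
E = 7
D = pow(E, -1, 100 * 102)    # private exponent, used only to simulate the oracle


def half_oracle(y: int) -> int:
    """Stand-in for O_h(y, N): 0 if the plaintext of y is below N/2, else 1."""
    m = pow(y, D, N)
    return 0 if 2 * m < N else 1


def invert_rsa(c: int) -> int:
    """Recover the plaintext of c by binary search driven by the oracle."""
    y = c
    low, high = Fraction(0), Fraction(N)
    two_e = pow(2, E, N)
    while high - low > 1:
        b = half_oracle(y)
        y = y * two_e % N          # ciphertext of twice the current plaintext
        mid = (high + low) / 2
        if b == 1:
            low = mid
        else:
            high = mid
    # The plaintext is the only integer in [low, high); this equals
    # floor(high), as in the text, whenever high is not itself an integer.
    return ceil(low)
```

Running `invert_rsa(3)` returns 4672, the preimage stated in the text.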

As an example suppose we have N = 10403 and e = 7 as the public information and we wish
to invert the RSA function for the ciphertext c = 3 using the oracle $\mathcal{O}_h(y, N)$.

So the preimage of 3 under the RSA function $x \longmapsto x^7 \pmod{10403}$ is 4672.

11.9.  Computational  Models:  The  Random  Oracle  Model 

The  final  item  we  need  to  discuss,  before  we  can  move  on,  is  the  computational  model  in  which 
we work. When we discussed all the definitions above we assumed something called “the standard
model”; this corresponds to the “real world” in some sense. However, there is another model used
in  cryptography  called  the  Random  Oracle  Model,  or  ROM. 

To  understand  this,  recall  our  game  for  PRF  security.  Recall  that  we  needed  to  consider  function 
families  since  otherwise  it  is  trivial  for  an  adversary  to  tell  whether  it  is  playing  against  the  real 
function  or  the  random  function.  If  it  knows  the  function  precisely  then  it  can  call  the  function 
itself  and  see  whether  the  result  of  the  oracle  call  is  real  or  random.  This  works  for  keyed  functions 
like  PRF  families,  but  there  are  many  functions  in  cryptography  which  are  fixed  yet  which  we  think 
behave  like  random  functions. 

In  the  random  oracle  model  we  provide  all  parties,  the  challenger  and  the  adversary,  with  a 
random  function  with  domain  {0, 1}*  and  a  finite  codomain  C.  The  challenger  has  control  of  the 
random  function,  whereas  the  adversary  can  just  call  it.  In  particular  the  challenger  can  make  up 
the random function “on the fly”, as long as the adversary cannot tell the difference between this
function and a random function. Thus an adversary in the ROM has available to it an oracle $\mathcal{O}_H$
as in Figure 11.22.

Such  a  function  cannot  exist  in  the  real  world,  as  it  would  take  infinite  space  to  represent. 
However,  there  are  unkeyed  cryptographic  functions,  called  hash  functions,  which  are  designed 
to  behave  like  random  functions.  So  in  a  security  proof  we  may  assume  that  the  real  unkeyed 
cryptographic  hash  function  is  a  truly  random  function,  then  when  we  build  the  scheme  we  replace 
the  truly  random  function  by  the  real  unkeyed  cryptographic  hash  function. 

What  does  this  mean  in  practice?  We  can  think  of  a  proof  in  the  ROM  as  a  proof  in  which  the 
adversary  makes  no  use  of  knowledge  of  the  real  unkeyed  cryptographic  hash  function.  In  other 
words  from  the  adversary’s  point  of  view  the  function  is  just  a  random  function.  Thus  a  proof 
in  the  ROM  rules  out  attacks  where  the  adversary  makes  no  use  of  the  specifics  of  the  unkeyed 
cryptographic  hash  function.  Of  course  such  a  proof  does  not  rule  out  an  attack  which  uses  specific 
properties  (for  example  the  code)  of  the  unkeyed  cryptographic  hash  function. 
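The adversary $A$ sends a query $x$ to the oracle $O_H$, which keeps a list $\mathcal{L}$ and answers as follows:

    $\mathcal{L} \leftarrow \{\}$ (once, at the start of the game)
    If $(x, y') \in \mathcal{L}$ then $y \leftarrow y'$
    else $y \leftarrow C$
    $\mathcal{L} \leftarrow \mathcal{L} \cup \{(x, y)\}$
    Return $y$ to $A$

Figure 11.22. An adversary with access to a random oracle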
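The “on the fly” sampling in Figure 11.22 can be sketched in a few lines of Python. This is only an illustration: the class name `RandomOracle`, the method `query` and the choice of a codomain of fixed-length byte strings are assumptions made for the example, not notation from the text.

```python
import secrets


class RandomOracle:
    """Lazily sampled random function from byte strings to out_bytes-byte strings."""

    def __init__(self, out_bytes=32):
        self.out_bytes = out_bytes
        self.table = {}  # plays the role of the list L in Figure 11.22

    def query(self, x: bytes) -> bytes:
        # If x was asked before, repeat the earlier answer;
        # otherwise sample a fresh uniform value and remember it.
        if x not in self.table:
            self.table[x] = secrets.token_bytes(self.out_bytes)
        return self.table[x]


oracle = RandomOracle(out_bytes=16)
first = oracle.query(b"hello")
second = oracle.query(b"hello")
```

Repeated queries return the same value, so from the caller's point of view the oracle is consistent with a single random function that was fixed in advance.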


The use of the ROM allows us to prove secure various schemes used in the real world. Admittedly,
it is a kind of cheat, but a very successful one. We shall return to this and give examples in
later  chapters,  so  do  not  worry  at  this  stage  if  this  looks  like  an  added  complication.  It  actually 
makes  many  things  much  simpler. 


Chapter  Summary 


•  We  gave  definitions  of  security  of  pseudo-random  functions,  pseudo-random  permutations, 
one-way  and  trapdoor  one-way  functions. 

•  We  gave  the  definition  of  the  advantage  of  an  adversary. 

•  The  definition  of  what  it  means  for  a  scheme  to  be  secure  can  be  different  from  one’s  initial 
naive  view. 

•  Today  the  notion  of  semantic  security  is  the  de  facto  standard  definition  for  encryption 
schemes. 

•  Semantic security is hard to prove but it is closely related to the simpler notion of
polynomial security, often called indistinguishability of encryptions.

•  We  also  need  to  worry  about  the  capabilities  of  the  adversary.  For  encryption  this  is  divided 
into  three  categories:  passive  attacks,  chosen  plaintext  attacks,  and  chosen  ciphertext 
attacks. 

•  There  are  many  other  notions  of  encryption  security,  all  of  which  can  be  related  to  each 
other. 

•  Encryption  security  against  adaptive  adversaries  and  the  notion  of  non-malleability  are 
closely  related. 

•  Similar considerations apply to the security of signature schemes, where we are now
interested in the notion of existential unforgeability under an active attack.

•  The  Random  Oracle  Model,  or  ROM,  is  a  way  of  creating  an  idealized  random  function 
to  help  in  security  analysis. 


Further  Reading 

A good introduction to the definitional work in cryptography based on provable security, and its
extensions and foundations in the idea of zero-knowledge proofs, can be found in the book by
Goldreich. A survey of the initial work in this field, up to around 1990, can be found in the article by
Goldwasser. The book by Katz and Lindell takes the approach of definitions and security proofs
as its starting point. It is thus a good book to follow up your reading of this one.

O.  Goldreich.  Modern  Cryptography,  Probabilistic  Proofs  and  Pseudorandomness.  Springer,  1999. 

S.  Goldwasser.  The  Search  for  Provably  Secure  Cryptosystems.  In  Cryptology  and  Computational 
Number  Theory,  Proc.  Symposia  in  Applied  Maths,  Volume  42,  AMS,  1990. 

J.  Katz  and  Y.  Lindell.  Introduction  to  Modern  Cryptography:  Principles  and  Protocols.  CRC 
Press,  2007. 


CHAPTER  12 


Modern  Stream  Ciphers 


Chapter  Goals 

•  To  understand  the  basic  principles  of  modern  symmetric  ciphers. 

•  To  explain  the  workings  of  a  modern  stream  cipher. 

•  To  investigate  the  properties  of  linear  feedback  shift  registers  (LFSRs). 

•  To  explain  how  to  introduce  non-linearity  into  a  stream  cipher. 

12.1.  Stream  Ciphers  from  Pseudo-random  Functions 

We can interpret a pseudo-random function as a stream cipher. Let $\{F_k\}_K$ be a PRF family with
codomain $C$ of bitstrings of length $\ell$. The PRF family immediately defines a stream cipher for
messages of length $\ell$ bits. We encrypt a message by setting

\[ c = m \oplus F_k(0). \]

We then want to show that this scheme is IND-PASS if the underlying PRF is secure. This result is
given in the next theorem.

Theorem 12.1. If $\{F_k\}_K$ is a PRF family outputting strings of length $\ell$ bits then the stream cipher
$\Pi$ given by $c = m \oplus F_k(0)$ for $m \in P = \{0,1\}^{\ell}$ is IND-PASS. In particular

\[ \mathrm{Adv}^{\text{IND-PASS}}_{\Pi}(A) \le 2 \cdot \mathrm{Adv}^{\mathrm{PRF}}_{\{F_k\}_K}(A). \]

Proof. We take the adversary $A$ and consider the game it is playing, as depicted in Figure 12.1.
We then change the game slightly, by performing a so-called “game hop”. This hop consists of
replacing the real PRF function by a completely random function; see Figure 12.2. We call the
first game $G_0$ and the second game $G_1$. We stress that the adversary (i.e. the algorithm that the
adversary runs) in both games is the same; we are just changing the rules of the game.


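    Challenger: $k \leftarrow \mathrm{KeyGen}_{\mathrm{PRF}}()$, $b \leftarrow \{0,1\}$
    $A$ calls the oracle $O_{LR}$ once with $m_0, m_1 \in P$
    $O_{LR}$ returns $c^* \leftarrow m_b \oplus F_k(0)$
    $A$ outputs $b'$
    Win if $b' = b$

Figure 12.1. Security game $G_0$ for the scheme $c \leftarrow m \oplus F_k(0)$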


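    Challenger: $k \leftarrow \mathrm{KeyGen}_{\mathrm{PRF}}()$, $b \leftarrow \{0,1\}$
    $A$ calls the oracle $O_{LR}$ once with $m_0, m_1 \in P$
    $O_{LR}$ returns $c^* \leftarrow m_b \oplus \mathrm{Rand}(0)$
    $A$ outputs $b'$
    Win if $b' = b$

Figure 12.2. “Hopped” security game $G_1$ for the scheme $c \leftarrow m \oplus \mathrm{Rand}(0)$

Now let $b'_0$ denote the bit returned by the adversary in game $G_0$ and $b'_1$ denote the bit returned by
the adversary in game $G_1$. Similarly let $b_0$ and $b_1$ denote the bits chosen by the challenger in games
$G_0$ and $G_1$ respectively. We have the following relationships between the various probabilities:

\[ \bigl| \Pr[b'_0 = 1 \mid b_0 = 1] - \Pr[b'_1 = 1 \mid b_1 = 1] \bigr| = \mathrm{Adv}^{\mathrm{PRF}}_{\{F_k\}_K}(A) \tag{14} \]

and

\[ \bigl| \Pr[b'_0 = 0 \mid b_0 = 0] - \Pr[b'_1 = 0 \mid b_1 = 0] \bigr| = \mathrm{Adv}^{\mathrm{PRF}}_{\{F_k\}_K}(A), \tag{15} \]

i.e. for fixed $b$ in both games the difference in the winning probabilities between the two games is
the same as the advantage in distinguishing a member of a PRF family from a random function.
Also note that

\[ \Pr[b'_1 = 1 \mid b_1 = 1] - \Pr[b'_1 = 1 \mid b_1 = 0] = 0 \tag{16} \]

since if we have a random function then an “encryption” of $m_0$ is a random string, as is an
“encryption” of $m_1$, so the adversary's output cannot depend on $b_1$; this is essentially the security
of the one-time pad. Thus the probability of the adversary winning in game $G_1$ is equal to 1/2.
Putting this together, and writing $\epsilon = \mathrm{Adv}^{\mathrm{PRF}}_{\{F_k\}_K}(A)$, we have

\begin{align*}
\mathrm{Adv}^{\text{IND-PASS}}_{\Pi}(A)
 &= \bigl| \Pr[b'_0 = 1 \mid b_0 = 1] - \Pr[b'_0 = 1 \mid b_0 = 0] \bigr| && \text{by definition} \\
 &= \bigl| \Pr[b'_0 = 1 \mid b_0 = 1] - \bigl( \Pr[b'_1 = 1 \mid b_1 = 1] - \Pr[b'_1 = 1 \mid b_1 = 1] \bigr) - \Pr[b'_0 = 1 \mid b_0 = 0] \bigr| && \text{adding zero} \\
 &\le \bigl| \Pr[b'_0 = 1 \mid b_0 = 1] - \Pr[b'_1 = 1 \mid b_1 = 1] \bigr| + \bigl| \Pr[b'_1 = 1 \mid b_1 = 1] - \Pr[b'_0 = 1 \mid b_0 = 0] \bigr| && \text{triangle inequality} \\
 &\le \epsilon + \bigl| \Pr[b'_1 = 1 \mid b_1 = 1] - \Pr[b'_0 = 1 \mid b_0 = 0] \bigr| && \text{by equation (14)} \\
 &= \epsilon + \bigl| \Pr[b'_1 = 1 \mid b_1 = 1] - \bigl( \Pr[b'_1 = 1 \mid b_1 = 0] - \Pr[b'_1 = 1 \mid b_1 = 0] \bigr) - \Pr[b'_0 = 1 \mid b_0 = 0] \bigr| && \text{adding zero again} \\
 &\le \epsilon + \bigl| \Pr[b'_1 = 1 \mid b_1 = 1] - \Pr[b'_1 = 1 \mid b_1 = 0] \bigr| + \bigl| \Pr[b'_1 = 1 \mid b_1 = 0] - \Pr[b'_0 = 1 \mid b_0 = 0] \bigr| && \text{triangle inequality} \\
 &= \epsilon + \bigl| \Pr[b'_1 = 1 \mid b_1 = 0] - \Pr[b'_0 = 1 \mid b_0 = 0] \bigr| && \text{by equation (16)} \\
 &= \epsilon + \bigl| \Pr[b'_1 = 0 \mid b_1 = 0] - \Pr[b'_0 = 0 \mid b_0 = 0] \bigr| && \text{complementary events} \\
 &\le 2 \cdot \epsilon && \text{by equation (15).}
\end{align*}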
So when defining a stream cipher we want to look for a candidate which could possibly be a
pseudo-random function. In this chapter we will look at various practical constructions of functions which
output what looks like random data.
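As a concrete illustration of the construction $c = m \oplus F_k(0)$ from Theorem 12.1, the following Python sketch instantiates $F_k$ with HMAC-SHA256. Treating HMAC-SHA256 as a PRF is an assumption made here for the example (the text does not fix a particular PRF), and the function names `prf`, `encrypt` and `decrypt` are hypothetical.

```python
import hashlib
import hmac

BLOCK_BYTES = 32  # output length of the assumed PRF, i.e. ell = 256 bits


def prf(key: bytes, x: int) -> bytes:
    """Stand-in for F_k(x): HMAC-SHA256 keyed by `key` (an assumed PRF)."""
    return hmac.new(key, x.to_bytes(8, "big"), hashlib.sha256).digest()


def encrypt(key: bytes, message: bytes) -> bytes:
    """Compute c = m XOR F_k(0) for a message of exactly ell bits."""
    if len(message) != BLOCK_BYTES:
        raise ValueError("message must be exactly 32 bytes long")
    pad = prf(key, 0)
    return bytes(m ^ p for m, p in zip(message, pad))


# XOR with the same pad is its own inverse.
decrypt = encrypt

key = b"k" * 16
message = b"exactly thirty-two bytes of text"
ciphertext = encrypt(key, message)
```

Since the pad is always $F_k(0)$, each key should be used for a single message only; the theorem gives IND-PASS security, which covers exactly that one-message setting.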


12.2.  Linear  Feedback  Shift  Registers 

A  standard  way  of  producing  a  binary  stream  of  data  is  to  use  a  feedback  shift  register.  These  are 
small  circuits  containing  a  number  of  memory  cells,  each  of  which  holds  one  bit  of  information.  The 
set  of  such  cells  forms  a  register.  In  each  cycle  a  certain  predefined  set  of  cells  are  “tapped”  and 
their  value  is  passed  through  a  function,  called  the  feedback  function.  The  register  is  then  shifted 
down  by  one  bit,  with  the  output  bit  of  the  feedback  shift  register  being  the  bit  that  is  shifted  out 
of  the  register.  The  combination  of  the  tapped  bits  is  then  fed  into  the  empty  cell  at  the  top  of  the 
register.  Compare  this  to  how  we  modelled  the  Lorenz  cipher  wheels  in  Chapter  10:  the  difference 
is  that  the  output  bit  is  replaced  by  a  new  bit  which  depends  on  other  bits  within  the  register. 
This  is  explained  in  Figure  12.3. 


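[Diagram: a register of cells $s_{L-1}, s_{L-2}, s_{L-3}, \ldots, s_2, s_1, s_0$; the tapped cells feed the
feedback function, whose output is fed into cell $s_{L-1}$ as the register shifts and $s_0$ is output.]

Figure 12.3. Feedback shift register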


It  is  desirable,  for  reasons  we  shall  see  later,  to  use  some  form  of  non-linear  function  as  the 
feedback  function.  However,  this  is  often  hard  to  do  in  practice,  hence  usually  one  uses  a  linear 
feedback  shift  register,  or  LFSR  for  short,  where  the  feedback  function  is  a  linear  function  of  the 
tapped  bits.  In  each  cycle  a  certain  predefined  set  of  cells  are  “tapped”  and  their  value  is  exclusive- 
or’ed  together.  The  register  is  then  shifted  down  by  one  bit,  with  the  output  bit  of  the  LFSR  being 
the  bit  that  is  shifted  out  of  the  register.  Again,  the  combination  of  the  tapped  bits  is  then  fed 
into  the  empty  cell  at  the  top  of  the  register. 

Mathematically this can be defined as follows, where the register is assumed to be of length
$L$. One defines a set of bits $[c_1, \ldots, c_L]$, where $c_i$ is set to one if the corresponding cell is tapped
and to zero otherwise. The initial internal state of the register is given by the bit sequence
$[s_{L-1}, \ldots, s_1, s_0]$. The output sequence is then defined to be
$s_0, s_1, s_2, \ldots, s_{L-1}, s_L, s_{L+1}, \ldots$ where for $j \ge L$ we have

\[ s_j = c_1 \cdot s_{j-1} \oplus c_2 \cdot s_{j-2} \oplus \cdots \oplus c_L \cdot s_{j-L}. \]

Note that for an initial state of all zeros the output sequence will be the zero sequence, but for a
non-zero initial state the output sequence must eventually be periodic (since we must eventually
return to a state we have already been in). The period of a sequence is defined to be the smallest
integer $N$ such that

\[ s_{N+i} = s_i \]

for all sufficiently large $i$. In fact there are $2^L - 1$ possible non-zero states and so the most one can
hope for is that an LFSR, for all non-zero initial states, produces an output stream whose period
is exactly $2^L - 1$.

Each state of the linear feedback shift register can be obtained from the previous state via a
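matrix multiplication. If we write

\[ M = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
c_L & c_{L-1} & c_{L-2} & \cdots & c_1
\end{pmatrix} \]

and

\[ v = (1, 0, 0, \ldots, 0) \]

and we write the internal state as

\[ s = (s_1, s_2, \ldots, s_L) \]

then the next state can be deduced by computing

\[ s \leftarrow M \cdot s \]

and the output bit can be produced by computing the vector product

\[ v \cdot s. \]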
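The recurrence and the matrix description can be compared directly in code. The sketch below is a minimal Python illustration; the function names are hypothetical, `taps` holds $(c_1, \ldots, c_L)$, and the state is stored oldest bit first so that the output bit is the first entry, matching the vector $v$.

```python
def lfsr_stream(taps, state, n):
    """First n output bits; taps = (c_1..c_L), state = (s_0..s_{L-1})."""
    s = list(state)
    L = len(taps)
    while len(s) < n:
        j = len(s)
        # s_j = c_1*s_{j-1} XOR c_2*s_{j-2} XOR ... XOR c_L*s_{j-L}
        s.append(sum(taps[i - 1] & s[j - i] for i in range(1, L + 1)) % 2)
    return s[:n]


def companion_matrix(taps):
    """The L x L matrix M: shifted identity on top, (c_L, ..., c_1) as last row."""
    L = len(taps)
    M = [[1 if col == row + 1 else 0 for col in range(L)] for row in range(L - 1)]
    M.append(list(reversed(taps)))
    return M


def matrix_step(M, s):
    """Compute M * s over GF(2)."""
    return [sum(a & b for a, b in zip(row, s)) % 2 for row in M]


taps = (1, 0, 0, 1)  # c_1 = c_4 = 1, i.e. C(X) = 1 + X + X^4
stream = lfsr_stream(taps, (1, 1, 1, 1), 15)

# The same bits via s <- M*s, reading off v*s (the first entry) each time.
M = companion_matrix(taps)
s, via_matrix = [1, 1, 1, 1], []
for _ in range(15):
    via_matrix.append(s[0])
    s = matrix_step(M, s)
```

Both computations give the same 15 bits, after which the sequence repeats, as expected for a register of length four with maximal period $2^4 - 1$.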

The properties of the output sequence are closely tied up with the properties of the binary
polynomial

\[ C(X) = 1 + c_1 \cdot X + c_2 \cdot X^2 + \cdots + c_L \cdot X^L \in \mathbb{F}_2[X], \]

called the connection polynomial for the LFSR. The connection polynomial and the matrix are
related via

\[ C(X) = \det(X \cdot M - I_L). \]

In some textbooks the connection polynomial is written in reverse, i.e. they use

\[ G(X) = X^L \cdot C(1/X) \]

as the connection polynomial. One should note that in this case $G(X)$ is the characteristic
polynomial of the matrix $M$.

As examples see Figure 12.4 for an LFSR in which the connection polynomial is given by
$X^3 + X + 1$ and Figure 12.5 for an LFSR in which the connection polynomial is given by $X^{32} + X^3 + 1$.


[Diagram: three cells $s_2$, $s_1$, $s_0$ with exclusive-or feedback.]

Figure 12.4. Linear feedback shift register: $X^3 + X + 1$

Figure 12.5. Linear feedback shift register: $X^{32} + X^3 + 1$


Of  particular  importance  is  the  case  in  which  the  connection  polynomial  is  primitive. 
Definition 12.2. A binary polynomial $C(X)$ of degree $n$ is primitive if it is irreducible and a
root $\theta$ of $C(X)$ generates the multiplicative group of the field $\mathbb{F}_{2^n}$. In other words, since $C(X)$ is
irreducible we already have

\[ \mathbb{F}_2[X]/(C(X)) = \mathbb{F}_2(\theta) = \mathbb{F}_{2^n}, \]

but we also require $\mathbb{F}_{2^n}^{*} = \langle \theta \rangle$.

The  properties  of  the  output  sequence  of  the  LFSR  can  then  be  deduced  from  the  following  cases. 

•  $c_L = 0$: i.e. the register is longer than the degree of the connection polynomial.

   In this case the sequence is said to be singular. The output sequence may not be purely
   periodic, but it will be eventually periodic.

•  $c_L = 1$:

   Such a sequence is called non-singular. The output is always purely periodic, in that
   it satisfies $s_{N+i} = s_i$ for all $i$ rather than for all sufficiently large values of $i$. Of the
   non-singular sequences, of particular interest are those satisfying one of the following:

   •  $C(X)$ is irreducible:

      Every non-zero initial state will produce a sequence with period equal to the smallest
      value of $N$ such that $C(X)$ divides $1 + X^N$. We have that $N$ will divide $2^L - 1$.

   •  $C(X)$ is primitive:

      Every non-zero initial state produces an output sequence which is periodic and of
      exact period $2^L - 1$.

We do not prove these results here, but proofs can be found in any good textbook on the application
of finite fields to coding theory, cryptography or communications science. However, we present four
examples which show the different behaviours. All examples are on 4-bit registers, i.e. $L = 4$.

Example 1: In this example we use an LFSR with connection polynomial $C(X) = X^3 + X + 1$.
We therefore see that $\deg(C) \ne L$, and so the sequence will be singular. The matrix $M$ generating
the sequence is given by

\[ \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \end{pmatrix}. \]

If we label the states of the LFSR by the number whose binary representation is the state value,
i.e. $s_0 = (0,0,0,0)$ and $s_5 = (0,1,0,1)$, then the periods of this LFSR can be represented by the
transitions in Figure 12.6. Note that it is not purely periodic.


Example 2: Now let the connection polynomial be $C(X) = X^4 + X^3 + X^2 + 1 = (X + 1)(X^3 + X + 1)$,
which corresponds to the matrix

\[ \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}. \]

The state transitions are then given by Figure 12.7. Note that it is purely periodic, but with two
different cycle lengths due to the different factorization properties of the connection polynomial
modulo 2: two cycles of length $7 = 2^3 - 1$ corresponding to the factor of degree three, and one
of length $1 = 2^1 - 1$ corresponding to the factor of degree one. We ignore the trivial cycle of the
zero state.

Figure 12.6. Transitions of the 4-bit LFSR with connection polynomial $X^3 + X + 1$

Figure 12.7. Transitions of the 4-bit LFSR with connection polynomial $X^4 + X^3 + X^2 + 1$


Example 3: Now take the connection polynomial $C(X) = X^4 + X^3 + X^2 + X + 1$, which is
irreducible, but not primitive. The matrix is now given by

\[ \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 \end{pmatrix}. \]

The state transitions are then given by Figure 12.8. Note that it is purely periodic and all cycles
have the same length, bar the trivial one.

Figure 12.8. Transitions of the 4-bit LFSR with connection polynomial $X^4 + X^3 + X^2 + X + 1$

Example 4: As our final example we take the connection polynomial $C(X) = X^4 + X + 1$, which
is irreducible and primitive. The matrix $M$ is now

\[ \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 1 \end{pmatrix} \]

and the state transitions are given by Figure 12.9.
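The cycle structure claimed in the four examples can be checked by brute force over all 16 states. The following Python sketch is an illustration only; the helper names are hypothetical, and `taps` again denotes $(c_1, c_2, c_3, c_4)$.

```python
from itertools import product


def step(state, taps):
    """One clock of the register: drop the oldest bit, append the feedback bit."""
    L = len(taps)
    new_bit = 0
    for i in range(1, L + 1):
        new_bit ^= taps[i - 1] & state[L - i]
    return state[1:] + (new_bit,)


def cycle_lengths(taps):
    """Sorted lengths of all cycles in the state-transition graph."""
    L = len(taps)
    on_cycle, lengths = set(), []
    for start in product((0, 1), repeat=L):
        s = start
        for _ in range(2 ** L):  # walk far enough to be sure we sit on a cycle
            s = step(s, taps)
        if s in on_cycle:
            continue
        n, t = 0, s
        while True:  # measure the cycle through s
            on_cycle.add(t)
            t = step(t, taps)
            n += 1
            if t == s:
                break
        lengths.append(n)
    return sorted(lengths)


example1 = cycle_lengths((1, 0, 1, 0))  # X^3 + X + 1 in a 4-bit register
example2 = cycle_lengths((0, 1, 1, 1))  # X^4 + X^3 + X^2 + 1
example3 = cycle_lengths((1, 1, 1, 1))  # X^4 + X^3 + X^2 + X + 1
example4 = cycle_lengths((1, 0, 0, 1))  # X^4 + X + 1
```

In Example 1 only eight of the sixteen states lie on a cycle; the other eight are transient states that feed into the cycle of length seven, which is why that register is not purely periodic.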

Whilst  there  are  algorithms  to  generate  primitive  polynomials  for  use  in  applications  we  shall 
not  describe  them  here.  The  following  list  gives  some  examples,  all  with  a  small  number  of  taps 
for  efficiency. 


    $X^{31} + X^3 + 1$,       $X^{31} + X^6 + 1$,       $X^{31} + X^7 + 1$,
    $X^{39} + X^4 + 1$,       $X^{60} + X + 1$,         $X^{63} + X + 1$,
    $X^{71} + X^6 + 1$,       $X^{93} + X^2 + 1$,       $X^{137} + X^{21} + 1$,
    $X^{145} + X^{52} + 1$,   $X^{161} + X^{18} + 1$,   $X^{521} + X^{32} + 1$.

Although  LFSRs  efficiently  produce  bit  streams  from  a  small  key,  especially  when  implemented  in 
hardware,  they  are  not  usable  on  their  own  for  cryptographic  purposes.  This  is  because  they  are 
essentially  linear,  which  is  after  all  why  they  are  efficient. 

Figure 12.9. Transitions of the 4-bit LFSR with connection polynomial $X^4 + X + 1$

We shall now show that if we know an LFSR to have $L$ internal registers and we can determine
$2 \cdot L$ consecutive bits of the stream then we can determine the whole stream. First notice that
we need to determine $L$ unknowns: the $L$ values of the “taps” $c_i$, since the $L$ values of the initial
state $s_0, \ldots, s_{L-1}$ are given to us. This type of data could be available in a known plaintext attack,
where we obtain the ciphertext corresponding to a known piece of plaintext; since the encryption
operation is simply exclusive-or we can determine as many bits of the keystream as we require.
Using the equation

\[ s_j = \sum_{i=1}^{L} c_i \cdot s_{j-i} \pmod{2} \]

for $j = L, \ldots, 2L-1$, we obtain $L$ linear equations in the $L$ unknowns, which we then solve via
standard matrix techniques. We write our matrix equation as

\[ \begin{pmatrix}
s_{L-1} & s_{L-2} & \cdots & s_1 & s_0 \\
s_L & s_{L-1} & \cdots & s_2 & s_1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
s_{2L-3} & s_{2L-4} & \cdots & s_{L-1} & s_{L-2} \\
s_{2L-2} & s_{2L-3} & \cdots & s_L & s_{L-1}
\end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_{L-1} \\ c_L \end{pmatrix}
=
\begin{pmatrix} s_L \\ s_{L+1} \\ \vdots \\ s_{2L-2} \\ s_{2L-1} \end{pmatrix}. \]

As an example, suppose we see the output sequence

    1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, ...

and we are told that this sequence was the output of a four-bit LFSR. Using the above matrix
equation, and solving it modulo 2, we would find that the connection polynomial was given by

\[ X^4 + X + 1. \]
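The four-bit example can be reproduced with a small Gaussian elimination over $\mathbb{F}_2$. The sketch below is illustrative; the function name `recover_taps` is hypothetical, and it simply raises an error if the $L \times L$ system happens to be singular.

```python
def recover_taps(bits, L):
    """Solve for (c_1..c_L) from the first 2L keystream bits, working modulo 2."""
    # Row for s_j (j = L..2L-1): [s_{j-1}, s_{j-2}, ..., s_{j-L} | s_j]
    rows = [[bits[j - i] for i in range(1, L + 1)] + [bits[j]]
            for j in range(L, 2 * L)]
    for col in range(L):
        pivot = next((r for r in range(col, L) if rows[r][col]), None)
        if pivot is None:
            raise ValueError("singular system: these bits do not determine the taps")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(L):
            if r != col and rows[r][col]:
                # Adding rows over GF(2) is a bitwise exclusive-or.
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[col])]
    return [rows[i][L] for i in range(L)]


observed = [1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
taps = recover_taps(observed[:8], 4)
```

The recovered taps are $c_1 = c_4 = 1$ and $c_2 = c_3 = 0$, i.e. $C(X) = 1 + X + X^4$, in agreement with the connection polynomial stated above.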


Hence, if we use an LFSR of size $L$ to generate a keystream for a stream cipher and the adversary
obtains at least $2 \cdot L$ bits of this keystream then she can determine the exact LFSR used and so
generate as much of the keystream as she wishes. Therefore, we would like to be able to adapt
the use of LFSRs in some non-linear way, which hides their linearity in order to produce output
sequences with high linear complexity. We can conclude that a stream cipher based solely on a
single LFSR is insecure against a known plaintext attack.

12.2.1.  Linear  Complexity:  An  important  measure  of  the  cryptographic  quality  of  a  sequence 
is  given  by  the  linear  complexity  of  the  sequence. 

Definition 12.3 (Linear complexity). For an infinite binary sequence

\[ s = s_0, s_1, s_2, s_3, \ldots, \]

we define the linear complexity of $s$ as $L(s)$ where

•  $L(s) = 0$ if $s$ is the zero sequence,

•  $L(s) = \infty$ if no LFSR generates $s$,

•  $L(s)$ is the length of the shortest LFSR to generate $s$, otherwise.

Since we cannot compute the linear complexity of an infinite set of bits we often restrict ourselves
to a finite set $s^n$ of the first $n$ bits. The linear complexity satisfies the following properties for any
sequence $s$.

•  For all $n \ge 1$ we have $0 \le L(s^n) \le n$.

•  If $s$ is periodic with period $N$ then $L(s) \le N$.

•  $L(s \oplus t) \le L(s) + L(t)$.

For a random sequence of bits, which is what we want from a stream cipher’s keystream generator,
we should have that the expected linear complexity of $s^n$ is approximately just larger than $n/2$.
But for a keystream generated by an LFSR of length $L$ we know that we will have $L(s^n) = L$ for all
$n \ge 2 \cdot L$. Hence, an LFSR produces nothing at all like a random bit string. After all it is produced
by a linear function!

We have seen that if we know the length of the LFSR then, from the output bits, we can
generate the connection polynomial. To determine the length we use the linear complexity profile,
which is defined to be the sequence $L(s^1), L(s^2), L(s^3), \ldots$. There is also an efficient algorithm
called the Berlekamp-Massey algorithm which given a finite sequence $s^n$ will compute the linear
complexity profile

\[ L(s^1), L(s^2), L(s^3), \ldots, L(s^n). \]

In addition the Berlekamp-Massey algorithm will also output the associated connection polynomial,
if $n \ge 2 \cdot L(s^n)$, using a technique more efficient than the prior matrix technique.
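The text does not spell out the Berlekamp-Massey algorithm, so the following Python sketch should be read as one standard formulation over $\mathbb{F}_2$ rather than as the author's presentation. The function name and the return format (the profile together with the coefficient list $[1, c_1, \ldots, c_L]$) are choices made for this example.

```python
def berlekamp_massey(bits):
    """Return (linear complexity profile, [1, c_1, ..., c_L]) for a bit list."""
    n = len(bits)
    c = [1] + [0] * n  # current connection polynomial C(X)
    b = [1] + [0] * n  # C(X) as it was before the last change of length
    L, m = 0, -1       # current length, index of the last length change
    profile = []
    for i in range(n):
        # Discrepancy between s_i and the bit predicted by the current LFSR.
        d = bits[i]
        for k in range(1, L + 1):
            d ^= c[k] & bits[i - k]
        if d:
            previous = c[:]
            shift = i - m
            for k in range(n + 1 - shift):  # C(X) <- C(X) + X^shift * B(X)
                c[k + shift] ^= b[k]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, previous
        profile.append(L)
    return profile, c[:L + 1]


profile, poly = berlekamp_massey([1, 1, 1, 1, 0, 1, 0, 1])
```

For the first eight bits of the four-bit example above, the profile jumps from 1 to 4 at the fifth bit and then stays there, and the returned polynomial is $1 + X + X^4$.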

12.3.  Combining  LFSRs 

To obtain greater security a common practice is to use a number, say $n$, of LFSRs, each producing
a different output sequence $x_1, \ldots, x_n$. The key is then the initial state of all of the LFSRs
and the keystream is produced from these $n$ generators using a non-linear combination function
$f(x_1, \ldots, x_n)$, as described in Figure 12.10.

We begin by examining the case where the combination function is a Boolean function of the
output bits of the constituent LFSRs. For analysis of this function we write it as a sum of distinct
products of variables, e.g.

\[ f(x_1, x_2, x_3, x_4, x_5) = 1 \oplus x_2 \oplus x_3 \oplus (x_4 \cdot x_5) \oplus (x_1 \cdot x_2 \cdot x_3 \cdot x_5). \]

However, in practice the Boolean function could be implemented in a different way. When expressed
as a sum of products of variables we say that the Boolean function is in algebraic normal form.

Figure 12.10. Combining LFSRs

Suppose that one uses $n$ LFSRs of maximal length (i.e. all with a primitive connection
polynomial) and whose lengths $L_1, \ldots, L_n$ are all distinct and greater than two. Then, an amazing fact
is that the linear complexity of the keystream generated by $f(x_1, \ldots, x_n)$ is equal to

\[ f(L_1, \ldots, L_n) \]

where we replace $\oplus$ in $f$ with integer addition and multiplication modulo two by integer
multiplication, assuming $f$ is expressed in algebraic normal form. The non-linear order of the polynomial
$f$ is then defined to be equal to the total degree of $f$.¹
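Evaluating $f$ over the integers is easy to mechanize once $f$ is stored in algebraic normal form. In the Python sketch below a function is represented as a list of monomials, each monomial being a tuple of 1-based variable indices, with the empty tuple standing for the constant 1. This representation, the function names and the sample register lengths are all assumptions made for the illustration.

```python
from math import prod


def linear_complexity_from_anf(monomials, lengths):
    """Evaluate f(L_1, ..., L_n) with XOR -> integer +, AND -> integer *."""
    return sum(prod(lengths[i - 1] for i in mono) for mono in monomials)


def nonlinear_order(monomials):
    """Total degree of f: the largest number of variables in any monomial."""
    return max((len(mono) for mono in monomials), default=0)


# f = 1 + x2 + x3 + x4*x5 + x1*x2*x3*x5, the example given in the text.
f_example = [(), (2,), (3,), (4, 5), (1, 2, 3, 5)]
# f = x1*x2 + x2*x3 + x3, a combiner on three registers.
f_three = [(1, 2), (2, 3), (3,)]

complexity_example = linear_complexity_from_anf(f_example, (3, 4, 5, 7, 11))
complexity_three = linear_complexity_from_anf(f_three, (3, 4, 5))
```

With the hypothetical lengths $(3, 4, 5, 7, 11)$ the first function gives $1 + 4 + 5 + 7 \cdot 11 + 3 \cdot 4 \cdot 5 \cdot 11 = 747$, and the three-register combiner with lengths $(3, 4, 5)$ gives $3 \cdot 4 + 4 \cdot 5 + 5 = 37$.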

However, it turns out that creating a non-linear function which results in a high linear
complexity is not the whole story. For example, consider the stream cipher produced by the Geffe
generator. This generator takes three LFSRs of maximal period and distinct sizes, $L_1$, $L_2$ and $L_3$,
and then combines them using the following second-order non-linear function,

\[ z = f(x_1, x_2, x_3) = (x_1 \cdot x_2) \oplus (x_2 \cdot x_3) \oplus x_3. \tag{17} \]

This would appear to have very nice properties: its linear complexity is given by

\[ L_1 \cdot L_2 + L_2 \cdot L_3 + L_3 \]

and its period is given by

\[ (2^{L_1} - 1)(2^{L_2} - 1)(2^{L_3} - 1). \]

However, it turns out to be cryptographically weak. To understand the weakness of the Geffe
generator consider the following table, which presents the outputs $x_i$ of the constituent LFSRs and
the resulting output $z$ of the Geffe generator.

    x_1   x_2   x_3  |  z
    -----------------+----
     0     0     0   |  0
     0     0     1   |  1
     0     1     0   |  0
     0     1     1   |  0
     1     0     0   |  0
     1     0     1   |  1
     1     1     0   |  1
     1     1     1   |  1

If the Geffe generator was using a “good” non-linear combining function then the output bits $z$
would not reveal any information about the corresponding output bits of the constituent LFSRs.
However, we can easily see that

\[ \Pr(z = x_1) = 3/4 \quad \text{and} \quad \Pr(z = x_3) = 3/4. \]
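These two probabilities can be confirmed by enumerating the truth table of equation (17), under the assumption that the three input bits are independent and uniformly distributed. The names in this Python sketch are chosen for the illustration.

```python
from fractions import Fraction
from itertools import product


def geffe(x1, x2, x3):
    """The combining function of equation (17)."""
    return (x1 & x2) ^ (x2 & x3) ^ x3


# One row (x1, x2, x3, z) for each of the eight possible inputs.
rows = [(x1, x2, x3, geffe(x1, x2, x3))
        for x1, x2, x3 in product((0, 1), repeat=3)]


def agreement(index):
    """Fraction of rows in which input number `index` (0-based) equals z."""
    return Fraction(sum(row[index] == row[3] for row in rows), len(rows))


p1, p2, p3 = agreement(0), agreement(1), agreement(2)
```

The output agrees with $x_1$ and with $x_3$ in six of the eight rows, but with $x_2$ in only four, so $x_2$ on its own shows no such bias.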

This means that the output bits of the Geffe generator are correlated with the bits of two of the
constituent LFSRs. Hence, we can attack the generator using a correlation attack, as follows.
Suppose we know the lengths $L_i$ of the constituent generators, but not the connection polynomials
or their initial states. The attack is described in Algorithm 12.1.

¹ The total degree of a polynomial in $n$ variables is the maximum sum of the degrees in each monomial term.
Algorithm 12.1: Correlation attack on the Geffe generator

    for all primitive connection polynomials of degree $L_1$ do
        for all initial states of the first LFSR do
            Compute $2 \cdot L_1$ bits of output of the first LFSR.
            Compute how many are equal to the output of the Geffe generator.
            A large value signals that this is the correct choice of generator and starting state.
    Repeat the above for the third LFSR.
    Recover the second LFSR by testing possible values using equation (17).


It turns out that there are a total of

\[ S = \phi(2^{L_1} - 1) \cdot \phi(2^{L_2} - 1) \cdot \phi(2^{L_3} - 1) / (L_1 \cdot L_2 \cdot L_3) \]

possible connection polynomials for the three LFSRs in the Geffe generator. The total number of
initial states of the Geffe generator is

\[ T = (2^{L_1} - 1)(2^{L_2} - 1)(2^{L_3} - 1) \approx 2^{L_1 + L_2 + L_3}. \]

This means that the key size of the Geffe generator is

\[ S \cdot T \approx S \cdot 2^{L_1 + L_2 + L_3}. \]

For a secure stream cipher we would like the size of the key space to be about the same as the
number of operations needed to break the stream cipher. However, the above correlation attack on
the Geffe generator requires roughly

\[ S \cdot (2^{L_1} + 2^{L_2} + 2^{L_3}) \]

operations. The reason for the reduced complexity is that we can deal with each constituent LFSR
in turn.
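A toy version of the correlation attack can be run on very small registers. The Python sketch below makes several simplifying assumptions that are not in Algorithm 12.1: the three connection polynomials ($1 + X + X^3$, $1 + X + X^4$ and $1 + X^2 + X^5$) are taken as known, so only the initial states are searched, and a full period of $7 \cdot 15 \cdot 31$ keystream bits is used so that the agreement counts are clear-cut. All names and the secret states are hypothetical.

```python
from itertools import product


def lfsr_bits(taps, state, n):
    """First n output bits; taps = (c_1..c_L), state = (s_0..s_{L-1})."""
    s = list(state)
    L = len(taps)
    for j in range(L, n):
        s.append(sum(taps[i - 1] & s[j - i] for i in range(1, L + 1)) % 2)
    return s[:n]


def geffe_stream(x1, x2, x3):
    """Combine three bit streams with equation (17)."""
    return [(a & b) ^ (b & c) ^ c for a, b, c in zip(x1, x2, x3)]


def best_correlated_state(z, taps):
    """Non-zero initial state whose output agrees with z most often."""
    best_state, best_score = None, -1
    for state in product((0, 1), repeat=len(taps)):
        if not any(state):
            continue
        x = lfsr_bits(taps, state, len(z))
        score = sum(a == b for a, b in zip(x, z))
        if score > best_score:
            best_state, best_score = state, score
    return best_state, best_score


TAPS = [(1, 0, 1), (1, 0, 0, 1), (0, 1, 0, 0, 1)]     # assumed known
SECRET = [(1, 0, 1), (0, 1, 1, 0), (1, 0, 0, 1, 1)]   # the key to recover
N = 7 * 15 * 31
z = geffe_stream(*(lfsr_bits(t, s, N) for t, s in zip(TAPS, SECRET)))

# Attack the first and third registers separately via their correlation with z.
state1, score1 = best_correlated_state(z, TAPS[0])
state3, score3 = best_correlated_state(z, TAPS[2])
x1, x3 = lfsr_bits(TAPS[0], state1, N), lfsr_bits(TAPS[2], state3, N)
# Then find the second register by testing candidates against equation (17).
state2 = next(s for s in product((0, 1), repeat=4)
              if geffe_stream(x1, lfsr_bits(TAPS[1], s, N), x3) == z)
```

The search examines $7 + 31 + 16$ candidate states rather than the $7 \cdot 15 \cdot 31$ combinations an exhaustive search over all three registers would need, which is the divide-and-conquer saving described above.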

To combine high linear complexity and resistance to correlation attacks (and other attacks)
designers have had to be a little more ingenious in their choice of non-linear combiners for LFSRs. We
now outline a small subset of some of the most influential.

12.3.1. Filter Generator: The basic idea here is to take a single primitive LFSR with internal
state $s_1, \ldots, s_L$ and then make the output of the stream cipher a non-linear function of the whole
state, i.e. $z = F(s_1, \ldots, s_L)$. If $F$ has non-linear order $m$ then the linear complexity of the resulting
sequence is at most

\[ \sum_{i=1}^{m} \binom{L}{i}. \]

12.3.2. Alternating-Step Generator: This takes three LFSRs of size $L_1$, $L_2$ and $L_3$ which are
pairwise coprime and of roughly the same size. Denote the output sequence of the three LFSRs by
$x_1$, $x_2$ and $x_3$. The first LFSR is clocked on every iteration; if its output $x_1$ is equal to one, then
the second LFSR is clocked and the output of the third LFSR is repeated from its last value. If the
output of $x_1$ is equal to zero, then the third LFSR is clocked and the output of the second LFSR
is repeated from its last value. The output of the generator is the value of $x_2 \oplus x_3$. This operation
is described graphically in Figure 12.11, where (as in Chapter 10) we denote a clocking signal by a
black dot to the left of the LFSR which is being clocked. The LFSR will clock one step if the wire
has a one on it, and will otherwise remain in its current state. The alternating-step generator has
period

\[ 2^{L_1} \cdot (2^{L_2} - 1) \cdot (2^{L_3} - 1) \]

and linear complexity approximately $(L_2 + L_3) \cdot 2^{L_1}$.


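[Diagram: a clock drives LFSR 1; the output of LFSR 1 clocks LFSR 2 directly and clocks LFSR 3
through an inverter; the outputs of LFSR 2 and LFSR 3 are exclusive-or'ed to give the output.]

Figure 12.11. Graphical representation of the alternating-step generator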
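The clocking rule of the alternating-step generator can be sketched as follows. To keep the example short, the three registers are modelled simply as sequences of the bits they would produce when clocked. The text does not say what a register's “last value” is before it has been clocked for the first time, so taking it to be zero is an assumption of this sketch, as is the function name.

```python
def alternating_step(control, bits2, bits3):
    """Output bits of the generator for the given control stream x_1.

    bits2 and bits3 are the successive outputs of LFSR 2 and LFSR 3,
    consumed only when the corresponding register is clocked.
    """
    it2, it3 = iter(bits2), iter(bits3)
    x2 = x3 = 0  # assumed "last value" before a register is first clocked
    out = []
    for c in control:
        if c == 1:
            x2 = next(it2)  # clock LFSR 2, LFSR 3 repeats its last output
        else:
            x3 = next(it3)  # clock LFSR 3, LFSR 2 repeats its last output
        out.append(x2 ^ x3)
    return out


keystream = alternating_step([1, 1, 0, 0, 1], [1, 0, 1], [1, 1, 0])
```

In a real instantiation the three input streams would come from three LFSRs of pairwise coprime sizes, as described in the text.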


12.3.3. Shrinking Generator: Here we take two LFSRs with output sequence $x_1$ and $x_2$, and
the idea is to throw away some of the $x_2$ stream under the control of the $x_1$ stream. Both LFSRs
are clocked at the same time, and if $x_1$ is equal to one then the output of the generator is the
value of $x_2$. If $x_1$ is equal to zero then the generator just clocks again. Note that, consequently,
the generator does not produce a bit on each iteration. This operation is described graphically in
Figure 12.12. If we assume that the two constituent LFSRs have size $L_1$ and $L_2$ with $\gcd(L_1, L_2)$
equal to one, then the period of the shrinking generator is equal to

\[ (2^{L_2} - 1) \cdot 2^{L_1 - 1} \]

and its linear complexity is approximately $L_2 \cdot 2^{L_1}$.


[Diagram: a common clock drives LFSR 1 and LFSR 2; if $x_1 = 1$ then output $x_2$, else output nothing.]

Figure 12.12. Graphical representation of the shrinking generator
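The selection rule of the shrinking generator amounts to a one-line filter. In this Python sketch the two registers are again modelled as plain bit sequences, and the function name `shrink` is hypothetical.

```python
def shrink(x1, x2):
    """Keep the bit of x2 at every position where x1 is one; drop the rest."""
    return [b for a, b in zip(x1, x2) if a == 1]


keystream = shrink([1, 0, 1, 1, 0], [0, 1, 1, 0, 1])
```

Five clock cycles yield only three keystream bits here, which illustrates that the generator does not produce a bit on each iteration.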

12.3.4.  The  A5/1  Generator:  Probably  the  most  famous  of  the  LFSR-based  stream  ciphers  is 
A5/1.  This  was  the  stream  cipher  used  to  encrypt  the  on-air  traffic  in  the  second  generation  (a.k.a. 
GSM)  mobile  phone  networks  in  Europe  and  the  US.  It  was  developed  in  1987,  but  its  design  was 
kept  secret  until  1999  when  it  was  reverse  engineered.  There  is  a  weakened  version  of  the  algorithm 
called  A5/2  which  was  designed  for  use  in  places  to  which  there  were  various  export  restrictions. 
Various  attacks  have  been  published  on  A5/1  so  that  it  is  no  longer  considered  a  secure  cipher.  For 
example,  in  2006  it  was  shown  that  one  could  break  into  mobile  phone  conversations  which  had 
been  protected  with  A5/1  essentially  in  real  time.  In  the  replacement  for  GSM,  i.e.  UMTS  (a.k.a. 
3G  networks)  and  LTE  (a.k.a.  4G  networks),  the  A5/1  cipher  has  been  replaced  with  the  block 
cipher  KASUMI  applied  in  a  stream  cipher  mode  of  operation. 

The  stream  cipher  A5/1  makes  use  of  three  LFSRs  of  lengths  19,  22  and  23.  These  have 
characteristic  polynomials 

$$x^{18} + x^{17} + x^{16} + x^{13} + 1,$$

$$x^{21} + x^{20} + 1,$$

$$x^{22} + x^{21} + x^{20} + x^{7} + 1.$$

Alternatively (and equivalently) their connection polynomials are given by

$$x^{18} + x^{5} + x^{2} + x^{1} + 1,$$

$$x^{21} + x^{1} + 1,$$

$$x^{22} + x^{15} + x^{2} + x^{1} + 1.$$

The  output  of  the  cipher  is  the  exclusive-or  of  the  three  output  bits  of  the  three  LFSRs. 

To clock the registers we associate with each register a “clocking bit”. These are in positions
10, 11 and 12 of the LFSRs (assuming bits are ordered with 0 corresponding to the output bit;
other books may use a different ordering). We will call these bits $c_1$, $c_2$ and $c_3$. At each clock step
the three bits are computed and the “majority bit” is determined via the formula

$$(c_1 \cdot c_2) \oplus (c_2 \cdot c_3) \oplus (c_1 \cdot c_3).$$

The $i$th LFSR is then clocked if the majority bit is equal to the bit $c_i$. Thus clocking occurs subject
to the following table.

c_1  c_2  c_3   Majority bit   Clock LFSR 1   Clock LFSR 2   Clock LFSR 3
 0    0    0         0              Y              Y              Y
 0    0    1         0              Y              Y              N
 0    1    0         0              Y              N              Y
 0    1    1         1              N              Y              Y
 1    0    0         0              N              Y              Y
 1    0    1         1              Y              N              Y
 1    1    0         1              Y              Y              N
 1    1    1         1              Y              Y              Y

We see that in A5/1, each LFSR is clocked with probability 3/4. This operation is described
graphically in Figure 12.13, where the “gate” given by an equals sign is the equality-testing gate,
and the “gate” labelled by “Maj” is the majority function described above.
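The claim that each register is clocked with probability 3/4 can be checked by enumerating the eight possible values of the clocking bits. The following is a small sketch; the function names `majority` and `clocked` are hypothetical and the clocking bits are assumed to be uniformly distributed.

```python
from fractions import Fraction
from itertools import product

def majority(c1, c2, c3):
    """Majority bit (c1.c2) xor (c2.c3) xor (c1.c3)."""
    return (c1 & c2) ^ (c2 & c3) ^ (c1 & c3)

def clocked(c1, c2, c3):
    """For each of the three LFSRs, report whether it is clocked."""
    m = majority(c1, c2, c3)
    return tuple(c == m for c in (c1, c2, c3))

cases = [clocked(*bits) for bits in product((0, 1), repeat=3)]

# Fraction of the eight cases in which LFSR i is clocked.
prob = [Fraction(sum(case[i] for case in cases), len(cases)) for i in range(3)]
print(prob)

# At every step at least two of the three registers move.
print(min(sum(case) for case in cases))
```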


Figure  12.13.  Graphical  representation  of  the  A5/1  generator 
12.3.5. Trivium: Trivium is a relatively recent hardware design for a stream cipher, which
appears to be more secure than previous designs based on shift registers (although the security of
Trivium is not fully guaranteed at this point in time, with some theoretical attacks on it having
been presented). The basis of Trivium is a set of three shift registers called $a$, $b$ and $c$, of lengths
93, 84 and 111 bits respectively (making 288 bits in total). Once the state has been set up the
three shift registers feed into each other via the following equations, over $\mathbb{F}_2$:

$$a_i = c_{i-111} + c_{i-110} \cdot c_{i-109} + c_{i-66} + a_{i-69},$$

$$b_i = a_{i-93} + a_{i-92} \cdot a_{i-91} + a_{i-66} + b_{i-78},$$

$$c_i = b_{i-84} + b_{i-83} \cdot b_{i-82} + b_{i-69} + c_{i-87}.$$

Notice the regular pattern here: the three top bits of $a$, $b$ or $c$ are combined with a lower bit (in
position 66 or 69) and then with a bit of a second register, to obtain a new bit in the second register.

The output bit of Trivium is then obtained from the $\mathbb{F}_2$-equation

$$r_i = c_{i-111} + a_{i-93} + b_{i-84} + c_{i-66} + a_{i-66} + b_{i-69}.$$

To initialize the state an 80-bit key $k_0, \ldots, k_{79}$ and an (up to) 80-bit initial value (IV) $v_0, \ldots, v_{79}$
are fed into the lower bits of the $a$ and $b$ registers, with $a$ getting the key, and $b$ the IV. The rest of
the bits of all registers are set to zero, bar the top three bits of the $c$ register, which are set to one.
The system is then clocked $4 \cdot 288 = 1152$ times before any keystream is actually used.
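The register equations and the initialization can be put together in a short sketch. This assumes the standard Trivium update in which the second and third of the three top bits of each register are multiplied together, and it assumes list position $j$ of a register holds the bit written $j + 1$ clocks earlier. The function name `trivium_keystream` is hypothetical, the key and IV are passed as lists of bits, and the output has not been compared against published test vectors (whose bit-ordering conventions are not discussed here).

```python
def trivium_keystream(key_bits, iv_bits, n):
    """Return n keystream bits of a Trivium-style generator (sketch).
    a[j], b[j], c[j] hold a_{i-(j+1)}, b_{i-(j+1)}, c_{i-(j+1)}."""
    a = list(key_bits) + [0] * (93 - len(key_bits))   # key in the low bits of a
    b = list(iv_bits) + [0] * (84 - len(iv_bits))     # IV in the low bits of b
    c = [0] * 108 + [1, 1, 1]                         # top three bits of c set
    out = []
    for step in range(1152 + n):
        # Output bit, computed from the current state.
        r = c[110] ^ a[92] ^ b[83] ^ c[65] ^ a[65] ^ b[68]
        # New bits for each register, following the three update equations.
        new_a = c[110] ^ (c[109] & c[108]) ^ c[65] ^ a[68]
        new_b = a[92] ^ (a[91] & a[90]) ^ a[65] ^ b[77]
        new_c = b[83] ^ (b[82] & b[81]) ^ b[68] ^ c[86]
        a = [new_a] + a[:-1]
        b = [new_b] + b[:-1]
        c = [new_c] + c[:-1]
        if step >= 1152:          # discard the 4 * 288 warm-up clocks
            out.append(r)
    return out

ks = trivium_keystream([0] * 80, [0] * 80, 64)
print(len(ks))
```

Rebuilding the lists on every clock is slow but keeps the indices aligned with the equations; a hardware design would simply shift the registers.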

Note that this is the first of the stream ciphers we have looked at which explicitly utilizes an
IV. We shall see IVs being used in the next chapter on block ciphers, but the basic reason for using
them is to move beyond the IND-PASS security of Theorem 12.1. The IV essentially provides a
unique input to the keyed PRF that we are trying to produce. So in theoretical terms our cipher
becomes $c = m \oplus F_k(IV)$. We do not discuss the theoretical implications here, since much of the
discussion on block ciphers in the next chapter will be directly applicable in this situation as well.

12.4.  RC4 

RC  stands  for  Ron’s  Cipher  after  Ron  Rivest  of  MIT.  You  should  not  think  that  the  RC4  cipher  is 
a  prior  version  of  the  block  ciphers  RC5  and  RC6.  It  is  in  fact  a  very,  very  fast  stream  cipher.  It  is 
easy  to  remember  since  it  is  surprisingly  simple.  Up  until  quite  recently  it  was  widely  deployed  in 
browsers  to  secure  traffic  to  websites  using  the  TLS  protocol.  However,  recent  analysis  has  shown 
that  the  random  stream  produced  by  the  RC4  algorithm  does  not  behave  in  a  random  manner.  In 
particular,  each  output  byte  has  a  particular  bias.  What  is  surprising  is  that  the  recent  analysis  is 
relatively  straightforward  but  the  biases  had  not  been  discovered  in  over  twenty  years  of  use  of  the 
RC4  algorithm.  Now  that  the  vulnerability  is  known,  RC4  should  no  longer  be  used.  However,  we 
present  it  since  it  is  both  historically  important  and  elegantly  simple  in  design. 

To describe RC4 we take an array $S$, indexed from 0 to 255, consisting of the integers $0, \ldots, 255$,
permuted in some key-dependent way. The output of the RC4 algorithm is a keystream of bytes
$K$ which is exclusive-or’ed with the plaintext byte by byte. Since the algorithm works on bytes
and not bits and uses very simple operations, it is particularly fast in software. We start by letting
$i = 0$ and $j = 0$. We then repeat the steps in Algorithm 12.2.


Algorithm 12.2: RC4 algorithm

i ← (i + 1) mod 256.
j ← (j + S_i) mod 256.
swap(S_i, S_j).
t ← (S_i + S_j) mod 256.
K ← S_t.
The security rests on the observation that even if the attacker knows $K$ and $i$, he can deduce
the value of $S_t$, but this does not allow him to deduce anything about the internal state of the
table. This follows from the observation that he cannot deduce the value of $t$, as he does not know
$j$, $S_i$ or $S_j$. It is a very tightly designed algorithm as each line of the code needs to be there to
make the cipher immune to trivial attacks:

•  i ← (i + 1) mod 256:

Makes sure every array element is used once after 256 iterations.

•  j ← (j + S_i) mod 256:

Makes the output depend non-linearly on the array.

•  swap(S_i, S_j):

Makes sure the array is evolved and modified as the iteration continues.

•  t ← (S_i + S_j) mod 256:

Makes sure the output sequence reveals little about the internal state of the array.

The  initial  state  of  the  array  S  is  determined  from  the  key  using  Algorithm  12.3. 


Algorithm 12.3: RC4 key schedule

for i = 0 to 255 do S_i ← i.
Initialize K_i, for i = 0, ..., 255, with the key, repeating if necessary.
j ← 0.
for i = 0 to 255 do
    j ← (j + S_i + K_i) mod 256.
    swap(S_i, S_j).
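Algorithms 12.2 and 12.3 translate almost line for line into code. The sketch below is for illustration only (RC4 should no longer be used); the function names are hypothetical, and repeating the key is done by indexing it modulo its length.

```python
def rc4_key_schedule(key: bytes) -> list:
    """Algorithm 12.3: build the initial permutation S from the key."""
    S = list(range(256))
    j = 0
    for i in range(256):
        # key[i % len(key)] plays the role of K_i (the key, repeated).
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    return S

def rc4_keystream(key: bytes):
    """Algorithm 12.2: yield one keystream byte per iteration."""
    S = rc4_key_schedule(key)
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        t = (S[i] + S[j]) % 256
        yield S[t]

def rc4_xor(key: bytes, data: bytes) -> bytes:
    """Exclusive-or the data with the keystream, byte by byte."""
    return bytes(d ^ k for d, k in zip(data, rc4_keystream(key)))

ciphertext = rc4_xor(b"Key", b"Plaintext")
print(ciphertext.hex())                    # bbf316e8d940af0ad3
print(rc4_xor(b"Key", ciphertext))         # b'Plaintext'
```

Since encryption is just an exclusive-or with the keystream, applying `rc4_xor` a second time with the same key recovers the plaintext.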


Chapter  Summary 

•  Many  modern  stream  ciphers  can  be  obtained  by  combining,  in  a  non-linear  way,  simple 
bit  generators  called  LFSRs. 

•  LFSR-based  stream  ciphers  are  very  fast  ciphers,  suitable  for  implementation  in  hardware, 
to  encrypt  real-time  data  such  as  voice  or  video.  But  they  need  to  be  augmented  with  a 
method  to  produce  a  form  of  non-linear  output. 

•  RC4  provides  a  fast  and  compact  byte  oriented  stream  cipher  for  use  in  software,  but  it  is 
no  longer  considered  secure. 


Further  Reading 

A  good  introduction  to  linear  recurrence  sequences  over  finite  fields  is  in  the  book  by  Lidl  and 
Niederreiter.  This  book  covers  all  the  theory  one  requires,  including  examples  and  a  description  of 
the  Berlekamp-Massey  algorithm.  The  attacks  on  the  A5/1  algorithm  are  described  in  the  paper 
by  Barkan  et  al.  The  paper  by  AlFardan  et  al.  covers  recent  analysis  of  the  RC4  stream  cipher. 

N.J.  AlFardan,  D.J.  Bernstein,  K.G.  Paterson,  B.  Poettering  and  J.C.N.  Schuldt.  On  the  security 
of  RC4  in  TLS.  USENIX  Security  Symposium,  305-320,  USENIX  Association,  2013. 
E. Barkan, E. Biham and N. Keller. Instant ciphertext-only cryptanalysis of GSM encrypted
communication. Journal of Cryptology, 21, 391-429, 2008.

R.  Lidl  and  H.  Niederreiter.  Introduction  to  Finite  Fields  and  Their  Applications.  Cambridge 
University  Press,  1986. 


CHAPTER  13 


Block  Ciphers  and  Modes  of  Operation 


Chapter  Goals 

•  To  introduce  the  notion  of  block  ciphers. 

•  To  understand  the  workings  of  the  DES  algorithm. 

•  To  understand  the  workings  of  the  AES  algorithm. 

•  To  learn  about  the  various  standard  modes  of  operation  of  block  ciphers. 

13.1.  Introduction  to  Block  Ciphers 

The basic description of a block cipher is shown in Figure 13.1.

[Figure: a plaintext block $m$ and a secret key $k$ enter the cipher function $e$, which outputs the ciphertext block $c$.]

Figure 13.1. Operation of a block cipher

Block ciphers operate on blocks of plaintext one at a time to produce blocks of ciphertext. The
block of plaintext and the block of ciphertext are assumed to be of the same size, e.g. a block of
$n$ bits. Every string of $n$ bits in the domain should map to a string of $n$ bits in the codomain, and
every string of $n$ bits in the codomain should result from the application of the function to a string
in the domain. This means that for a fixed key a block cipher is bijective and hence is a
permutation. We write

$$c \leftarrow e_k(m),$$

$$m \leftarrow d_k(c)$$

where

•  $m \in \{0, 1\}^n$ is the plaintext block,

•  $k \in \mathbb{K}$ is the secret key, chosen from key space $\mathbb{K}$,

•  $e$ is the encryption function,

•  $d$ is the decryption function,

•  $c \in \{0, 1\}^n$ is the ciphertext block.

The  block  sizes  taken  are  usually  reasonably  large,  64  bits  in  DES  and  128  bits  or  more  in  modern 
block  ciphers. 

In terms of our prior definitions from Chapter 11, a block cipher should “act like” a family of
pseudo-random permutations (PRPs), indexed by the key space $\mathbb{K}$, $\{F_k\}_{\mathbb{K}}$. We put the “act like”
in quotes, as for all existing (efficient) block ciphers used in practice we cannot mathematically
prove that they are PRPs. Thus we can only “hope” that no adversary can break the PRP security
game (Figure 11.4 of Chapter 11) for the specific block cipher¹. In particular we hope that the
advantage of an adversary against the PRP property is something like

$$\mathrm{Adv}^{\mathrm{PRP}}_{\{F_k\}_{\mathbb{K}}}(A) \approx 1/|\mathbb{K}|$$

for all adversaries $A$. So we require the key space to be rather large to ensure this advantage is
small. But note that just because a block cipher has a large key space does not mean it will be a
secure PRP.

Despite  its  name  a  block  cipher  is  not  an  encryption  scheme;  it  is  a  building  block  to  create  an 
encryption  scheme.  The  term  for  how  one  creates  an  encryption  scheme  out  of  a  block  cipher  is  a 
mode  of  operation.  The  key  advantage  of  this  division,  between  a  mode  of  operation  and  a  block 
cipher  design,  is  that  we  can  design  our  modes  and  our  block  ciphers  independently.  As  remarked, 
the  design  goal  for  a  block  cipher  is  that  it  is  a  secure  pseudo-random  permutation,  whereas  the 
design  goal  of  a  mode  of  operation  is  one  of  our  security  goals,  such  as  IND-CCA.  A  designer  of 
a  mode  of  operation  tries  to  prove  mathematically  that  the  mode  satisfies  the  required  security 
definition  on  the  assumption  that  the  block  cipher  is  a  secure  PRP. 

There are many block ciphers in use today, some of which you may find used in your web browser;
these include AES, CAMELLIA, DES and 3DES. The most famous of these is DES, or the Data
Encryption  Standard.  This  was  first  published  in  the  mid-1970s  as  a  US  Federal  standard  and  soon 
became  the  de  facto  international  standard  for  banking  applications. 

The  DES  algorithm  stood  up  remarkably  well  to  the  test  of  time,  but  in  the  early  1990s  it 
became  clear  that  a  new  standard  was  required.  This  was  because  both  the  block  length  (64 
bits)  and  the  key  length  (56  bits)  of  basic  DES  were  too  small  for  new  applications.  It  is  now 
possible  to  recover  a  56-bit  DES  key  using  either  a  network  of  computers  or  specialized  hardware 
for  relatively  little  cost.  Therefore  DES  has  been  phased  out  of  most  applications;  although  it  still 
exists  as  a  component  in  the  variant  called  triple  DES  (3DES).  In  response  to  the  problem  of  DES 
being  deemed  insecure,  the  US  National  Institute  of  Standards  and  Technology  (NIST)  initiated  a 
competition  to  find  a  new  block  cipher,  to  be  called  the  Advanced  Encryption  Standard  or  AES. 

Unlike  the  process  used  to  design  DES,  which  was  kept  essentially  secret,  the  design  of  the  AES 
was  performed  in  public.  A  number  of  groups  from  around  the  world  submitted  designs  for  the 
AES.  Eventually  five  algorithms,  known  as  the  AES  finalists,  were  chosen  to  be  studied  in  depth. 
These  were 

•  MARS  from  a  group  at  IBM, 

•  RC6  from  a  group  at  RSA  Security, 

•  Twofish from a group based at Counterpane, UC Berkeley and elsewhere,

•  Serpent  from  a  group  of  three  academics  based  in  Israel,  Norway  and  the  UK, 

•  Rijndael  from  a  couple  of  Belgian  cryptographers. 

Finally in the fall of 2000, NIST announced that the overall AES winner had been chosen to be
Rijndael, and so henceforth Rijndael was known as AES.

DES  and  all  the  AES  finalists  are  examples  of  iterated  block  ciphers.  Block  ciphers  obtain  their 
security  by  repeated  use  of  a  simple  round  function.  The  round  function  takes  an  n-bit  block  and 
returns  an  n-bit  block,  where  n  is  the  block  size  of  the  overall  cipher.  The  number  of  rounds  r  can 
either  be  variable  or  fixed.  As  a  general  rule  increasing  the  number  of  rounds  will  increase  the  level 
of  security  of  the  block  cipher. 


¹An interesting side effect of our constructions is that we make no assumption on whether it is hard to break
the block cipher given an oracle for the PRP in both the forwards and backwards directions. After reading this chapter
you might want to consider why this is.
Each use of the round function employs a round key $k_i$ for $1 \le i \le r$ derived from the main
secret key $k$, using an algorithm called a key schedule. To allow decryption, for every round key
the  function  implementing  the  round  must  be  invertible,  and  for  decryption  the  round  keys  are 
used  in  the  opposite  order  to  that  in  which  they  were  used  for  encryption.  That  the  whole  round  is 
invertible  does  not  imply  that  the  functions  used  to  implement  the  round  need  to  be  invertible.  This 
may  seem  strange  at  first  reading  but  will  become  clearer  when  we  discuss  the  DES  cipher  later.  In 
DES  the  functions  needed  to  implement  the  round  function  are  not  invertible,  but  the  whole  round 
is  invertible.  For  AES  not  only  is  the  whole  round  function  invertible  but  every  function  used  to 
create  the  round  function  is  also  invertible. 

There  are  a  number  of  general-purpose  techniques  which  can  be  used  to  break  a  block  cipher, 
for  example:  exhaustive  search,  using  pre-computed  tables  of  intermediate  values  or  divide  and 
conquer.  Some  (badly  designed)  block  ciphers  can  be  susceptible  to  chosen  plaintext  attacks, 
where  encrypting  a  specially  chosen  plaintext  can  reveal  properties  of  the  underlying  secret  key.  In 
cryptanalysis  one  needs  a  combination  of  mathematical  and  puzzle-solving  skills,  plus  luck.  There 
are  a  few  more  advanced  techniques  which  can  be  employed: 

•  Differential  Cryptanalysis:  In  differential  cryptanalysis  one  looks  at  ciphertext  pairs, 
where  the  corresponding  plaintexts  have  a  particular  difference.  The  exclusive-or  of  such 
pairs  is  called  a  differential  and  certain  differentials  have  certain  probabilities  associated 
with  them,  depending  on  what  the  key  is.  By  analysing  the  probabilities  of  the  differentials 
computed  in  a  chosen  plaintext  attack  one  can  hope  to  reveal  the  underlying  structure  of 
the  key. 

•  Linear  Cryptanalysis:  Even  though  a  good  block  cipher  should  contain  non-linear 
components  the  idea  behind  linear  cryptanalysis  is  to  approximate  the  behaviour  of  the 
non-linear  components  with  linear  functions.  Again  the  goal  is  to  use  a  probabilistic 
analysis  to  determine  information  about  the  key. 

Surprisingly these two methods are quite successful against some ciphers. Both DES and AES are
designed to resist differential cryptanalysis, whereas AES is designed to also resist linear
cryptanalysis.

Since  DES  and  AES  are  likely  to  be  the  most  important  block  ciphers  in  use  for  the  next  few 
years  we  shall  study  them  in  some  detail.  We  also  do  this  as  they  both  show  general  design  principles 
in  their  use  of  substitutions  and  permutations.  Recall  that  the  historical  ciphers  in  Chapter  7  made 
use  of  such  operations,  so  we  see  that  not  much  has  changed.  Now,  however,  the  substitutions  and 
permutations  used  are  far  more  intricate.  On  their  own  they  do  not  produce  security,  but  when 
used  over  a  number  of  rounds  one  can  obtain  enough  security  for  our  applications. 

We  end  this  section  by  discussing  the  question,  which  is  best,  a  block  cipher  or  a  stream  cipher? 
The  main  difference  between  a  block  cipher  and  a  stream  cipher  is  that  block  ciphers  are  stateless, 
whilst  stream  ciphers  maintain  an  internal  state  which  is  needed  to  determine  which  part  of  the 
keystream  should  be  generated  next.  Here  are  just  a  few  general  points. 

•  Block  ciphers  are  more  general,  and  we  shall  see  that  one  can  easily  turn  a  block  cipher 
into  a  stream  cipher. 

•  Stream  cipher  designs  generally  have  a  more  mathematical  structure.  This  either  makes 
them  easier  to  break  or  easier  to  study  to  convince  oneself  that  they  are  secure. 

•  Stream  ciphers  are  generally  not  suitable  for  software,  since  they  usually  encrypt  one  bit 
at  a  time.  However,  stream  ciphers  are  highly  efficient  in  hardware. 

•  Block  ciphers  are  suitable  for  both  hardware  and  software,  but  are  generally  not  as  fast 
in  hardware  as  stream  ciphers. 
•  Hardware is always faster than software, but this performance improvement comes at the
cost of less flexibility.

•  One  can  use  block  ciphers  to  build  more  complex  functions  via  modes  of  operation,  and 
rigorously  analyse  them  if  we  assume  the  block  cipher  is  a  secure  PRP. 

13.2.  Feistel  Ciphers  and  DES 

The  DES  cipher  is  a  variant  of  the  basic  Feistel  cipher  described  in  Figure  13.2.  Feistel  ciphers 
are  named  after  H.  Feistel,  who  worked  at  IBM  and  performed  some  of  the  earliest  non-military 
research  on  encryption  algorithms.  The  interesting  property  of  a  Feistel  cipher  is  that  the  round 
function  is  invertible  regardless  of  the  choice  of  the  function  in  the  box  marked  F.  To  see  this 
notice  that  each  encryption  round  is  given  by 

$$L_i \leftarrow R_{i-1},$$

$$R_i \leftarrow L_{i-1} \oplus F(K_i, R_{i-1}).$$

Hence, the decryption can be performed via

$$R_{i-1} \leftarrow L_i,$$

$$L_{i-1} \leftarrow R_i \oplus F(K_i, L_i).$$


[Figure: plaintext block, halves $L_0$ and $R_0$, round function $F$, ciphertext block.]


Figure  13.2.  Basic  operation  of  a  Feistel  cipher 


This  means  that  in  a  Feistel  cipher  we  have  simplified  the  design  somewhat,  since 

•  we can choose any function for the function $F$, and we will still obtain an encryption
function which can be inverted using the secret key,

•  the  same  code/circuitry  can  be  used  for  the  encryption  and  decryption  functions.  We  only 
need  to  use  the  round  keys  in  the  reverse  order  for  decryption. 
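Both points can be seen in a toy example. The sketch below uses 16-bit halves and a deliberately non-invertible round function; the names `F`, `feistel_encrypt` and `feistel_decrypt`, the round keys and the sizes are all illustrative assumptions and have nothing to do with DES itself. Decryption reuses the encryption routine, with the round keys reversed and the two halves swapped on the way in and on the way out.

```python
def F(k, x):
    """A non-invertible round function: many inputs share one output."""
    return (x * x + k) & 0xFF00

def feistel_encrypt(left, right, round_keys, F):
    """Apply L_i <- R_{i-1}, R_i <- L_{i-1} xor F(K_i, R_{i-1}) per round."""
    for k in round_keys:
        left, right = right, left ^ F(k, right)
    return left, right

def feistel_decrypt(left, right, round_keys, F):
    """Same routine as encryption: swap halves, reverse keys, swap back."""
    r, l = feistel_encrypt(right, left, list(reversed(round_keys)), F)
    return l, r

keys = [3, 141, 59, 26]
plaintext = (0x1234, 0xABCD)
ciphertext = feistel_encrypt(*plaintext, keys, F)
print(feistel_decrypt(*ciphertext, keys, F) == plaintext)
print(F(1, 0) == F(1, 1))   # F is not injective, yet decryption works
```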

Of  course  to  obtain  a  secure  cipher  we  still  need  to  take  care  with 

•  how  the  round  keys  are  generated, 

•  how  many  rounds  to  take, 

•  how  the  round  function  F  is  defined. 


Work on DES was started in the early 1970s by a team in IBM which included Horst Feistel. It
was originally based on an earlier cipher of IBM’s called Lucifer, but some of the design was known
to have been amended by the National Security Agency (NSA). For many years this led conspiracy
theorists to believe that the NSA had placed a trapdoor in the design of the function $F$. However,
it is now widely accepted that the modifications made by the NSA were done to make the cipher
more secure. In particular, the changes made by the NSA made the cipher resistant to differential
cryptanalysis, a technique that was not discovered in the open research community until the 1980s.

DES  is  also  known  as  the  Data  Encryption  Algorithm  (DEA)  in  documents  produced  by  the 
American  National  Standards  Institute  (ANSI).  The  International  Organization  for  Standardization 
(ISO)  refers  to  DES  by  the  name  DEA-1.  It  was  a  worldwide  standard  for  around  thirty  years 
and  stands  as  the  first  publicly  available  cryptographic  algorithm  to  have  an  “official  status”.  It 
therefore  marks  an  important  step  on  the  road  from  cryptography  being  a  purely  military  area  to 
being  a  tool  for  the  masses.  The  use  of  DES  is  now  no  longer  recommended  on  its  own,  and  it  has 
been  withdrawn  from  all  standards. 

The  basic  properties  of  the  DES  cipher  are  that  it  is  a  variant  of  the  Feistel  cipher  design  in 
which 

•  the  number  of  rounds  r  is  16, 

•  the  block  length  n  is  64  bits, 

•  the  key  length  is  56  bits, 

•  the round keys $K_1, \ldots, K_{16}$ are each 48 bits.

Note  that  a  key  length  of  56  bits  is  insufficient  for  many  modern  applications,  hence  often  one  uses 
DES  by  using  three  keys  and  three  iterations  of  the  main  cipher.  Such  a  version  is  called  triple 
DES  or  3DES;  see  Figure  13.3. 


[Figure: the plaintext passes through three DES operations, each under its own key, to give the ciphertext.]


Figure  13.3.  Triple  DES 

In 3DES the key length is equal to 168. There is another way of using DES three times, but
using two keys instead of three, giving rise to a key length of 112. In this two-key version of 3DES,
one uses the 3DES basic structure but with the first and third key being equal. However, two-key
3DES is not as secure as one might initially think. Intuitively one might suspect that it has a key
size of 112 bits; however, one can break it with $2^{64}$ effort.

Theorem 13.1. Two-key 3DES can be broken in about $2^{64}$ time and about $2^{64}$ space, with a chosen
plaintext attack.

Proof. The two-key variant of 3DES is given by the equation

$$DES_{k_1}(DES_{k_2}^{-1}(DES_{k_1}(m))).$$

The technique to break this variant is a standard time/memory trade-off algorithm, which is very
similar to the Baby-Step/Giant-Step algorithm of Chapter 3.

The attacker executes the following steps:

(1)  For all $t_i \in \mathbb{K}$ we compute $a_i \leftarrow DES_{t_i}^{-1}(0)$, where 0 is the all-zero message block.

(2)  We store the $2^{64}$ tuples $(a_i, t_i)$ in a table.

(3)  For each tuple, we submit each $a_i$ as a plaintext to our chosen plaintext attack oracle. We
obtain

$$c_i \leftarrow DES_{k_1}(DES_{k_2}^{-1}(DES_{k_1}(a_i))).$$

(4)  For each value we compute $b_i \leftarrow DES_{t_i}^{-1}(c_i)$.

(5)  Using the table, we look for pairs $(a_j, b_i)$ for which $a_j = b_i$; this can be done fast for each
$b_i$ by sorting or hashing the initial table.

(6)  Output $(t_i, t_j)$ as a possible key, which can be tested with a few further chosen plaintext
attack oracle queries.

To see why this attack works, consider the following series of identities,

$$\begin{aligned}
DES_{t_i}(DES_{t_j}^{-1}(DES_{t_i}(a_i))) &= DES_{t_i}(DES_{t_j}^{-1}(DES_{t_i}(DES_{t_i}^{-1}(0)))) && \text{by step one} \\
&= DES_{t_i}(DES_{t_j}^{-1}(0)) && \text{by definition of DES} \\
&= DES_{t_i}(a_j) && \text{by step one} \\
&= DES_{t_i}(b_i) && \text{by step five} \\
&= DES_{t_i}(DES_{t_i}^{-1}(c_i)) && \text{by step four} \\
&= c_i && \text{by definition of DES.}
\end{aligned}$$

This is exactly what the chosen plaintext oracle outputs. Thus it is highly likely that $(k_1, k_2) =
(t_i, t_j)$, which can be confirmed by encrypting a few more plaintexts as in step six.


□ 
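The six steps of the proof can be tried out on a toy cipher. The sketch below replaces DES by a made-up 16-bit Feistel cipher with an 8-bit key, so that every loop is tiny; `SBOX`, `enc`, `dec`, `two_key_triple` and `attack` are hypothetical names introduced only for this illustration, and the toy cipher has no security value.

```python
# Toy stand-in for DES: 16-bit block, 8-bit key, 4 Feistel rounds.
SBOX = [(13 * x * x + 7 * x + 1) % 251 for x in range(256)]
ROUNDS = 4

def round_keys(k):
    return [(k + 59 * r) & 0xFF for r in range(ROUNDS)]

def enc(k, m):
    left, right = m >> 8, m & 0xFF
    for rk in round_keys(k):
        left, right = right, left ^ SBOX[right ^ rk]
    return (left << 8) | right

def dec(k, c):
    left, right = c >> 8, c & 0xFF
    for rk in reversed(round_keys(k)):
        left, right = right ^ SBOX[left ^ rk], left
    return (left << 8) | right

def two_key_triple(k1, k2, m):
    """Encrypt with k1, decrypt with k2, encrypt with k1 again."""
    return enc(k1, dec(k2, enc(k1, m)))

def attack(oracle, checks=(1, 2, 3)):
    # Steps 1 and 2: tabulate a_t = dec(t, 0) for every key t.
    table = {}
    for t in range(256):
        table.setdefault(dec(t, 0), []).append(t)
    found = []
    for ti in range(256):
        a_i = dec(ti, 0)
        c_i = oracle(a_i)                  # step 3: chosen plaintext query
        b_i = dec(ti, c_i)                 # step 4
        for tj in table.get(b_i, []):      # step 5: look for a_j = b_i
            # Step 6: confirm the candidate with a few more queries.
            if all(two_key_triple(ti, tj, m) == oracle(m) for m in checks):
                found.append((ti, tj))
    return found

k1, k2 = 0x3A, 0xC5
candidates = attack(lambda m: two_key_triple(k1, k2, m))
print((k1, k2) in candidates)
```

The attack makes a few hundred oracle queries and single decryptions, whereas a naive search over both keys would need $2^{16}$ triple encryptions; the same ratio is what makes the attack on two-key 3DES worthwhile.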


A similar time/memory trade-off can be applied to the full 3DES algorithm. However, the result
is that the effective complexity is $2^{112}$. Thus 3DES is still considered secure, just not as secure as
one would expect from its key size of 168 bits, but we have the expectation that

$$\mathrm{Adv}^{\mathrm{PRP}}_{\{3DES_k\}_{\mathbb{K}}}(A) \approx 1/2^{112}.$$
13.2.1. Overview of DES Operation: Basically DES is a Feistel cipher with 16 rounds, except
that before and after the main Feistel iteration a permutation is performed, as depicted in
Figure 13.4. This permutation appears to produce no change in the security, and people have often
wondered why it is there. One answer given by one of the original team members was that this
permutation was there to make the original implementation easier to fit on the circuit board.

In  summary  the  DES  cipher  operates  on  64  bits  of  plaintext  in  the  following  manner: 

•  Perform  an  initial  permutation. 

•  Split  the  blocks  into  left  and  right  half. 

•  Perform  16  rounds  of  identical  operations. 

•  Join  the  half  blocks  back  together. 

•  Perform  a  final  inverse  permutation. 

The  final  permutation  is  the  inverse  of  the  initial  permutation;  this  allows  the  same  hardware 
and/or  software  to  be  used  for  encryption  and  decryption.  The  key  schedule  provides  16  round 
keys  of  48  bits  in  length  by  selecting  48  bits  from  the  56-bit  main  key.  We  shall  now  describe  the 
operation of the function F in more detail. In each DES round this consists of the following five
stages:

•  Expansion  Permutation:  The  right  half  of  32  bits  is  expanded  and  permuted  to  48  bits. 
This  helps  the  diffusion  of  any  relationship  of  input  bits  to  output  bits.  The  expansion 
permutation  (which  is  different  from  the  initial  permutation)  has  been  chosen  so  that 
one  bit  of  input  affects  two  substitutions  in  the  output,  via  the  S-Boxes  below.  This 
helps  spread  dependencies  and  creates  an  avalanche  effect  (a  small  difference  between  two 
plaintexts  will  produce  a  very  large  difference  in  the  corresponding  ciphertexts). 

•  Round  Key  Addition:  The  48-bit  output  from  the  expansion  permutation  is  exclusive- 
or’d  with  the  round  key,  which  is  also  48  bits  in  length.  Note  that  this  is  the  only  place 
where  the  round  key  is  used  in  the  algorithm. 

•  Splitting:  The  resulting  48-bit  value  is  split  into  eight  lots  of  six-bit  values. 
[Figure: the plaintext block passes through IP, is split into $L_0$ and $R_0$, the Feistel round is iterated 16 times, and the result passes through IP$^{-1}$ to give the ciphertext block.]

Figure  13.4.  DES  as  a  Feistel  cipher 


•  S-Boxes:  Each  six-bit  value  is  passed  into  one  of  eight  different  S-Boxes  (Substitution 
Box)  to  produce  a  four-bit  result.  The  S-Boxes  represent  the  non-linear  component  in  the 
DES  algorithm  and  their  design  is  a  major  contributor  to  the  algorithm’s  security.  Each 
S-Box  is  a  look-up  table  of  four  rows  and  sixteen  columns.  The  six  input  bits  specify  which 
row  and  column  to  use.  Bits  1  and  6  generate  the  row  number,  whilst  bits  2,3,4  and  5 
specify  the  column  number.  The  output  of  each  S-Box  is  the  value  held  in  that  element 
in  the  table. 

•  P-Box:  We  now  have  eight  lots  of  four-bit  outputs  which  are  then  combined  into  a  32-bit 
value  and  permuted  to  form  the  output  of  the  function  F. 
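The row and column selection in the S-Box stage is easy to get wrong, so a small sketch may help. It uses the table for S-Box 1 from Figure 13.6 and assumes the six input bits are packed into an integer with bit 1 as the most significant bit; the function name `sbox_lookup` is hypothetical.

```python
# S-Box 1 of DES: four rows of sixteen four-bit values.
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]

def sbox_lookup(table, six_bits):
    """Bits 1 and 6 select the row, bits 2 to 5 select the column."""
    row = (((six_bits >> 5) & 1) << 1) | (six_bits & 1)
    col = (six_bits >> 1) & 0xF
    return table[row][col]

# Input 011011: row is binary 01 = 1, column is binary 1101 = 13.
print(sbox_lookup(S1, 0b011011))   # 5
```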

The  overall  structure  of  the  DES  F  function  is  explained  in  Figure  13.5. 


Figure  13.5.  Structure  of  the  DES  function  F 


We  now  give  details  of  each  of  the  steps  which  we  have  not  yet  fully  defined. 
Initial Permutation IP: The DES initial permutation is defined in the following table. Here the
58 in the first position means that the first bit of the output from the IP is the 58th bit of the
input, and so on.


58  50  42  34  26  18  10   2
60  52  44  36  28  20  12   4
62  54  46  38  30  22  14   6
64  56  48  40  32  24  16   8
57  49  41  33  25  17   9   1
59  51  43  35  27  19  11   3
61  53  45  37  29  21  13   5
63  55  47  39  31  23  15   7

The  inverse  permutation  is  given  in  a  similar  manner  by  the  following  table. 


40   8  48  16  56  24  64  32
39   7  47  15  55  23  63  31
38   6  46  14  54  22  62  30
37   5  45  13  53  21  61  29
36   4  44  12  52  20  60  28
35   3  43  11  51  19  59  27
34   2  42  10  50  18  58  26
33   1  41   9  49  17  57  25
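One can check mechanically that the second table really is the inverse of the first. In the sketch below the helper `permute` is a hypothetical name; it reads each table in the way described above, so that output bit $n$ is the input bit whose (1-based) position is the $n$th table entry.

```python
IP = [
    58, 50, 42, 34, 26, 18, 10, 2,
    60, 52, 44, 36, 28, 20, 12, 4,
    62, 54, 46, 38, 30, 22, 14, 6,
    64, 56, 48, 40, 32, 24, 16, 8,
    57, 49, 41, 33, 25, 17, 9, 1,
    59, 51, 43, 35, 27, 19, 11, 3,
    61, 53, 45, 37, 29, 21, 13, 5,
    63, 55, 47, 39, 31, 23, 15, 7,
]

IP_INV = [
    40, 8, 48, 16, 56, 24, 64, 32,
    39, 7, 47, 15, 55, 23, 63, 31,
    38, 6, 46, 14, 54, 22, 62, 30,
    37, 5, 45, 13, 53, 21, 61, 29,
    36, 4, 44, 12, 52, 20, 60, 28,
    35, 3, 43, 11, 51, 19, 59, 27,
    34, 2, 42, 10, 50, 18, 58, 26,
    33, 1, 41, 9, 49, 17, 57, 25,
]

def permute(bits, table):
    """Output position n (1-based) takes the input at position table[n-1]."""
    return [bits[p - 1] for p in table]

# Label the 64 input positions 1..64 and follow them through both tables.
block = list(range(1, 65))
print(permute(block, IP)[0])                       # 58
print(permute(permute(block, IP), IP_INV) == block)
```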
Expansion Permutation E: The expansion permutation is given in the following table. Each
row corresponds to the bits which are input into the corresponding S-Box at the next stage. Notice
how the bits which select the row of one S-Box (the first and last bit on each row) are also used to
select the column of another S-Box.


32   1   2   3   4   5
 4   5   6   7   8   9
 8   9  10  11  12  13
12  13  14  15  16  17
16  17  18  19  20  21
20  21  22  23  24  25
24  25  26  27  28  29
28  29  30  31  32   1
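The regular structure of this table can be captured by a formula, which also makes the sharing of bits between neighbouring S-Boxes explicit. The following sketch rebuilds the table row by row; the formula and the name `expand` are an illustration introduced here, not part of the DES specification text.

```python
from collections import Counter

# Row r (0-based) holds input bits 4r, 4r+1, ..., 4r+5, wrapping 0 -> 32 and 33 -> 1.
E = [((4 * r + j - 1) % 32) + 1 for r in range(8) for j in range(6)]

def expand(bits32):
    """Expand a list of 32 bits to 48 bits using the table E."""
    return [bits32[p - 1] for p in E]

counts = Counter(E)
twice = sorted(p for p, n in counts.items() if n == 2)
print(E[:6], E[-6:])
print(len(twice))   # input bits that feed two S-Boxes

# The first entry of each row reappears in the previous row, and the
# last entry of each row reappears in the next row.
shared = all(
    E[6 * r] == E[6 * ((r - 1) % 8) + 4] and E[6 * r + 5] == E[6 * ((r + 1) % 8) + 1]
    for r in range(8)
)
print(shared)
```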

S-Box:  The  details  of  the  eight  DES  S-Boxes  are  given  in  Figure  13.6.  Recall  that  each  box  consists 
of  a  table  with  four  rows  and  sixteen  columns. 


The  P-Box  Permutation  P:  The  P-Box  permutation  takes  the  eight  lots  of  four-bit  nibbles, 
output  by  the  S-Boxes,  and  produces  a  32-bit  permutation  of  these  values  as  given  by  the  following 
table.

S-Box 1

    14   4  13   1   2  15  11   8   3  10   6  12   5   9   0   7
     0  15   7   4  14   2  13   1  10   6  12  11   9   5   3   8
     4   1  14   8  13   6   2  11  15  12   9   7   3  10   5   0
    15  12   8   2   4   9   1   7   5  11   3  14  10   0   6  13

S-Box 2

    15   1   8  14   6  11   3   4   9   7   2  13  12   0   5  10
     3  13   4   7  15   2   8  14  12   0   1  10   6   9  11   5

0 

14 

7 

11 

10 

4 

13 
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15 
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8 
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11 
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7 

12 
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5 

14 

9 

10 

0 

9 

14 

6 

3 

15 

S-Box  3 
5  1 

13 

12 

7 

11 

4 

2 

8 

13 
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4 

6 

10 

2 

8 

5 

14 

12 

11 

15 

1 

13 

6 

4 

9 

8 

15 

3 
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11 

1 

2 
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10 

13 
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6 

9 
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7 

4 

15 

14 

3 

11 

5 

2 

12 

S-Box  4 


7 
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14 
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6 

15 
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4 

7 

2 

12 

1 

10 

14 

9 

10 

6 

9 
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3 
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5 
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1 
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11 

S-Box  5 
6  8 

5 

3 

15 
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0 

14 

9 

14 

11 

2 

12 

4 

7 
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15 
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S-Box  6 
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3 
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S-Box  7 
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Figure  13.6.  DES  S-Boxes 
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DES  Key  Schedule:  The  DES  key  schedule  takes  the  56-bit  key,  which  is  actually  input  as  a 
bitstring of 64 bits comprising the key and eight parity bits for error detection. These parity
bits are in bit positions 8, 16, ..., 64 and ensure that each byte of the key contains an odd number
of bits set to one. We first permute the bits of the key according to the following permutation
(which takes a 64-bit input and produces a 56-bit output, hence discarding the parity bits).

The output of this permutation, called PC-1 in the literature, is divided into a 28-bit left half $C_0$
and a 28-bit right half $D_0$. Now for each round we compute

$$C_i \leftarrow C_{i-1} \lll p_i,$$

$$D_i \leftarrow D_{i-1} \lll p_i,$$

where $x \lll p_i$ means perform a cyclic shift on $x$ to the left by $p_i$ positions. If the round number $i$
is 1, 2, 9 or 16 then we shift left by one position, otherwise we shift left by two positions. Finally
the two portions $C_i$ and $D_i$ are joined back together and are subject to another permutation, called
PC-2, to produce the final 48-bit round key. The permutation PC-2 is described below.
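The shift schedule can be illustrated with a short Python sketch. It covers only the cyclic shifts of the two 28-bit halves; the permutations PC-1 and PC-2 are deliberately left out, and the names `SHIFTS`, `rotl28` and `half_states` are hypothetical.

```python
# Shift amounts p_1, ..., p_16: one position in rounds 1, 2, 9 and 16, else two.
SHIFTS = [1 if i in (1, 2, 9, 16) else 2 for i in range(1, 17)]

MASK28 = (1 << 28) - 1


def rotl28(x, p):
    """Cyclic left shift of a 28-bit value x by p positions."""
    return ((x << p) | (x >> (28 - p))) & MASK28


def half_states(c0, d0):
    """Return the list of pairs (C_i, D_i) for i = 1, ..., 16."""
    c, d = c0, d0
    states = []
    for p in SHIFTS:
        c, d = rotl28(c, p), rotl28(d, p)
        states.append((c, d))
    return states


states = half_states(0x0F0F0F0, 0x1234567)
print(sum(SHIFTS), states[-1] == (0x0F0F0F0, 0x1234567))
```

The shifts add up to 28, so after the sixteenth round both halves are back at their starting values.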
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13.3.  AES 

The AES winner was decided in autumn 2000 to be the Rijndael algorithm designed by Joan Daemen
and Vincent Rijmen. AES is a block cipher which does not rely on the basic design of the Feistel
cipher; instead it is designed as a substitution-permutation network, or SP-network. However, AES
does have a number of similarities with DES. Block ciphers based on the SP-network design consist
of a series of rounds, each of which consists of a key addition phase, a substitution phase and a
permutation phase. The idea is that the permutation phase aims to produce an avalanche effect, by
spreading out differences in the input to other parts of the state as quickly as possible, performing
a process called diffusion. The substitution phase is the main non-linear component and this aims
to introduce as much non-linearity, or confusion, into the output as possible.

AES has, unlike DES, a strong mathematical structure, as most of its operations are based
on arithmetic in the finite fields $\mathbb{F}_{2^8}$ and $\mathbb{F}_2$. However, unlike DES the encryption and decryption
operations are distinct, and do not just require a re-ordering of the round keys.


Recall from Chapter 6 that elements of $\mathbb{F}_{2^8}$ are stored as bit vectors (or bytes) representing
binary polynomials. For example the byte given by 0x83 in hexadecimal gives the bit pattern

    1, 0, 0, 0, 0, 0, 1, 1

since

$$\texttt{0x83} = 8 \cdot 16 + 3 = 131$$

in decimal. One can obtain the bit pattern directly by noticing that 8 in binary is 1, 0, 0, 0 and 3 in
4-bit binary is 0, 0, 1, 1 and one simply concatenates these two bit strings together. The bit pattern
itself then corresponds to the binary polynomial

$$x^7 + x + 1.$$

So we say that the hexadecimal number 0x83 represents the binary polynomial

$$x^7 + x + 1.$$

Arithmetic in $\mathbb{F}_{2^8}$ in the AES algorithm is performed using polynomial arithmetic modulo the
irreducible polynomial

$$m(x) = x^8 + x^4 + x^3 + x + 1.$$
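As an illustration of this byte-as-polynomial view, here is a minimal Python sketch of multiplication modulo m(x). The function names `gf_mul` and `poly_str` are hypothetical, and the loop is written for clarity rather than speed or side-channel resistance.

```python
AES_MODULUS = 0x11B  # bit pattern of m(x) = x^8 + x^4 + x^3 + x + 1


def gf_mul(a, b):
    """Multiply two bytes as binary polynomials, reducing modulo m(x)."""
    result = 0
    while b:
        if b & 1:               # current coefficient of b is 1: add a
            result ^= a
        a <<= 1                 # multiply a by x
        if a & 0x100:           # degree reached 8: subtract (xor) m(x)
            a ^= AES_MODULUS
        b >>= 1
    return result


def poly_str(byte):
    """Render a byte as the binary polynomial it represents."""
    terms = []
    for i in range(7, -1, -1):
        if (byte >> i) & 1:
            terms.append("1" if i == 0 else "x" if i == 1 else f"x^{i}")
    return " + ".join(terms) or "0"


print(poly_str(0x83))
print(hex(gf_mul(0x57, 0x83)))
```

Addition in this field is bitwise exclusive-or, which is why the reduction step is a single xor with the modulus.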

AES identifies 32-bit words with polynomials in $\mathbb{F}_{2^8}[X]$ of degree less than four. This is done in a
big-endian format, in that the smallest index corresponds to the least important coefficient. Hence,
the word

$$a_0 \,\|\, a_1 \,\|\, a_2 \,\|\, a_3$$

will correspond to the polynomial

$$a_3 \cdot X^3 + a_2 \cdot X^2 + a_1 \cdot X + a_0.$$

Arithmetic is performed on polynomials in $\mathbb{F}_{2^8}[X]$ modulo the reducible polynomial

$$M(X) = X^4 + 1.$$

Hence, arithmetic is done on these polynomials in a ring rather than a field, since $M(X)$ is reducible.


Rijndael  was  a  parametrized  algorithm,  in  that  it  could  operate  on  block  sizes  of  128,  192  or 
256  bits,  but  in  the  final  AES  standard  the  block  size  was  fixed  at  128  bits.  However,  AES  does 
support  keys  of  size  128,  192  or  256  bits.  For  each  key  size  a  different  number  of  rounds  is  specified. 
To  make  our  discussion  simpler  we  shall  consider  the  simpler,  and  probably  more  used,  variant 
which  uses  a  block  size  of  128  bits  and  a  key  size  of  128  bits,  in  which  case  10  rounds  are  specified. 
From  now  on  our  discussion  is  only  of  this  simpler  version. 

AES operates on an internal four-by-four matrix of bytes, called the state matrix

$$S = \begin{pmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\
s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\
s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3}
\end{pmatrix},$$

which is usually held as a vector of four 32-bit words, each word representing a column. Each round
key is also held as a four-by-four matrix

$$K_i = \begin{pmatrix}
k_{0,0} & k_{0,1} & k_{0,2} & k_{0,3} \\
k_{1,0} & k_{1,1} & k_{1,2} & k_{1,3} \\
k_{2,0} & k_{2,1} & k_{2,2} & k_{2,3} \\
k_{3,0} & k_{3,1} & k_{3,2} & k_{3,3}
\end{pmatrix}.$$

13.3.1. AES Operations: The AES round function operates using a set of four operations which
we shall now describe.

SubBytes: Two types of S-Boxes are used in AES: one for the encryption rounds and one for the
decryption rounds, each one being the inverse of the other. We shall describe the encryption S-Box;
the decryption one will follow immediately. The S-Boxes of DES were chosen by searching through
a large space of possible S-Boxes, so as to avoid attacks such as differential cryptanalysis. The
S-Box of AES is chosen to have a simple mathematical structure, which allows one to formally argue
how resilient the cipher is to differential and linear cryptanalysis. Not only does this mathematical
structure help protect against differential cryptanalysis, but it also convinces users that it has not
been engineered with some hidden trapdoor.

Each byte $s = [s_7, \ldots, s_0]$ of the AES state matrix is taken in turn and considered as an element
of $\mathbb{F}_{2^8}$. The S-Box can be mathematically described in two steps:

(1) The multiplicative inverse of $s$ in $\mathbb{F}_{2^8}$ is computed, to produce a new byte $x = [x_7, \ldots, x_0]$.
For the element $[0, \ldots, 0]$, which has no multiplicative inverse, one uses the convention
that this is mapped to zero, so as to maintain a one-to-one mapping from the input to the
output of the S-Box.

(2) The bit vector $x$ is then mapped, via an affine transformation over $\mathbb{F}_2$, to the bit
vector $y$.

The new byte is given by $y$. The decryption S-Box is obtained by first inverting the affine
transformation and then taking the multiplicative inverse. These byte substitutions can either be
implemented using table look-up or by implementing circuits, or code, which implement the inverse
operation in $\mathbb{F}_{2^8}$ and the affine transformation.
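The two steps can be sketched in Python as follows. The affine step is written in the rotate-and-xor form with the constant 0x63, which is taken from the AES standard (FIPS 197) and is an assumption relative to the text here. The helper names are hypothetical, and the inverse is computed naively as $s^{254}$.

```python
def gf_mul(a, b):
    """Multiply in F_{2^8} modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(s):
    """Step (1): multiplicative inverse, with 0 mapped to 0 by convention."""
    if s == 0:
        return 0
    result = 1
    for _ in range(254):        # s^254 = s^(-1), as s^255 = 1 for s != 0
        result = gf_mul(result, s)
    return result


def rotl8(x, k):
    """Rotate a byte left by k bit positions."""
    return ((x << k) | (x >> (8 - k))) & 0xFF


def sbox(s):
    """Step (2) applied to step (1): affine map over F_2 (constant 0x63, per FIPS 197)."""
    x = gf_inv(s)
    return x ^ rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 3) ^ rotl8(x, 4) ^ 0x63


SBOX = [sbox(s) for s in range(256)]

# The decryption S-Box is the inverse table.
INV_SBOX = [0] * 256
for s, y in enumerate(SBOX):
    INV_SBOX[y] = s

print(hex(SBOX[0x00]), hex(SBOX[0x53]))
```

In practice the 256-entry table is usually precomputed once, which is the table look-up option mentioned above.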


ShiftRows: The ShiftRows operation in AES performs a cyclic shift on the state matrix. Each
row is shifted by a different offset. For AES this is given by

$$\begin{pmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\
s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\
s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3}
\end{pmatrix}
\longmapsto
\begin{pmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,1} & s_{1,2} & s_{1,3} & s_{1,0} \\
s_{2,2} & s_{2,3} & s_{2,0} & s_{2,1} \\
s_{3,3} & s_{3,0} & s_{3,1} & s_{3,2}
\end{pmatrix}.$$

The inverse of the ShiftRows operation is simply the equivalent shift in the opposite direction. The
ShiftRows operation ensures that the columns of the state matrix “interact” with each other over
a number of rounds.
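A minimal Python sketch of ShiftRows and its inverse, with the state held as a list of four rows (a hypothetical representation chosen for readability; implementations usually store columns as words):

```python
def shift_rows(state):
    """Rotate row r of the 4x4 state to the left by r positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def inv_shift_rows(state):
    """Rotate row r of the 4x4 state to the right by r positions."""
    return [row[4 - r:] + row[:4 - r] for r, row in enumerate(state)]


# Label each entry by its indices to make the movement visible.
state = [[f"s{r},{c}" for c in range(4)] for r in range(4)]
for row in shift_rows(state):
    print(row)
```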


MixColumns: The MixColumns operation ensures that the rows in the state matrix “interact”
with each other over a number of rounds; combined with the ShiftRows operation it ensures each
byte of the output state depends on each byte of the input state. We consider each column of the
state $[a_0, a_1, a_2, a_3]$ in turn, and consider it as a polynomial of degree less than four with coefficients
in $\mathbb{F}_{2^8}$. The new column $[b_0, b_1, b_2, b_3]$ is produced by taking the polynomial

$$a(X) = a_0 + a_1 \cdot X + a_2 \cdot X^2 + a_3 \cdot X^3$$

and multiplying it by the polynomial

$$c(X) = \texttt{0x02} + \texttt{0x01} \cdot X + \texttt{0x01} \cdot X^2 + \texttt{0x03} \cdot X^3$$

modulo

$$M(X) = X^4 + 1.$$

This operation is conveniently represented by the following matrix operation in $\mathbb{F}_{2^8}$,

$$\begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}
=
\begin{pmatrix}
\texttt{0x02} & \texttt{0x03} & \texttt{0x01} & \texttt{0x01} \\
\texttt{0x01} & \texttt{0x02} & \texttt{0x03} & \texttt{0x01} \\
\texttt{0x01} & \texttt{0x01} & \texttt{0x02} & \texttt{0x03} \\
\texttt{0x03} & \texttt{0x01} & \texttt{0x01} & \texttt{0x02}
\end{pmatrix}
\cdot
\begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{pmatrix}.$$

In $\mathbb{F}_{2^8}$ the above matrix is invertible, hence the inverse of the MixColumns operation can also be
implemented using a matrix multiplication such as that above.
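The equivalence of the polynomial and matrix descriptions can be checked with a short Python sketch. The names `mix_column_matrix` and `mix_column_poly` are hypothetical; the field multiplication is the shift-and-reduce loop modulo $x^8 + x^4 + x^3 + x + 1$.

```python
def gf_mul(a, b):
    """Multiply in F_{2^8} modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


MIX = [[0x02, 0x03, 0x01, 0x01],
       [0x01, 0x02, 0x03, 0x01],
       [0x01, 0x01, 0x02, 0x03],
       [0x03, 0x01, 0x01, 0x02]]

C = [0x02, 0x01, 0x01, 0x03]    # coefficients c_0, c_1, c_2, c_3 of c(X)


def mix_column_matrix(col):
    """Matrix form: b = MIX * a over F_{2^8}."""
    out = []
    for row in MIX:
        acc = 0
        for coeff, a in zip(row, col):
            acc ^= gf_mul(coeff, a)
        out.append(acc)
    return out


def mix_column_poly(col):
    """Polynomial form: a(X) * c(X) mod X^4 + 1, using X^4 = 1."""
    out = [0, 0, 0, 0]
    for i, a in enumerate(col):
        for j, c in enumerate(C):
            out[(i + j) % 4] ^= gf_mul(a, c)
    return out


column = [0xDB, 0x13, 0x53, 0x45]
print([hex(b) for b in mix_column_matrix(column)])
```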


AddRoundKey:  The  round  key  addition  is  particularly  simple.  One  takes  the  state  matrix  and 
exclusive-or’s  it,  byte  by  byte,  with  the  round  key  matrix.  The  inverse  of  this  operation  is  clearly 
the  same  operation. 


Round Structure: The AES algorithm can now be described using the pseudo-code in Algorithm
13.1. The message block to encrypt is assumed to be entered into the state matrix $S$; the output
encrypted block is also given by the state matrix $S$. Notice that the final round does not perform
a MixColumns operation. The corresponding decryption operation is described in Algorithm 13.2.

Algorithm 13.1: AES encryption outline

    AddRoundKey(S, K_0).
    for i = 1 to 9 do
        SubBytes(S).
        ShiftRows(S).
        MixColumns(S).
        AddRoundKey(S, K_i).
    SubBytes(S).
    ShiftRows(S).
    AddRoundKey(S, K_10).

Algorithm 13.2: AES decryption outline

    AddRoundKey(S, K_10).
    InverseShiftRows(S).
    InverseSubBytes(S).
    for i = 9 downto 1 do
        AddRoundKey(S, K_i).
        InverseMixColumns(S).
        InverseShiftRows(S).
        InverseSubBytes(S).
    AddRoundKey(S, K_0).
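To make the encryption outline concrete, the following Python sketch implements AES-128 encryption of one block. It is a teaching sketch under stated assumptions, not a production implementation: the state is a flat list of 16 bytes in column order, the S-Box constant 0x63 and the key schedule details follow FIPS 197, and all function names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply in F_{2^8} modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def _sbox_entry(s):
    """Field inverse (0 maps to 0) followed by the affine map of FIPS 197."""
    x = 0
    if s:
        x = 1
        for _ in range(254):
            x = gf_mul(x, s)
    rot = lambda v, k: ((v << k) | (v >> (8 - k))) & 0xFF
    return x ^ rot(x, 1) ^ rot(x, 2) ^ rot(x, 3) ^ rot(x, 4) ^ 0x63


SBOX = [_sbox_entry(s) for s in range(256)]


def round_keys(key):
    """AES-128 key schedule: eleven round keys of 16 bytes each."""
    w = [list(key[4 * j:4 * j + 4]) for j in range(4)]
    rc = 1
    for i in range(4, 44):
        t = list(w[i - 1])
        if i % 4 == 0:
            t = [SBOX[b] for b in t[1:] + t[:1]]   # rotate, then substitute
            t[0] ^= rc
            rc = gf_mul(rc, 2)
        w.append([a ^ b for a, b in zip(w[i - 4], t)])
    return [sum(w[4 * r:4 * r + 4], []) for r in range(11)]


def sub_bytes(s):
    return [SBOX[b] for b in s]


def shift_rows(s):
    # Entry (row r, column c) is stored at index 4*c + r.
    return [s[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]


def mix_columns(s):
    out = []
    for c in range(4):
        a = s[4 * c:4 * c + 4]
        out += [gf_mul(2, a[i]) ^ gf_mul(3, a[(i + 1) % 4])
                ^ a[(i + 2) % 4] ^ a[(i + 3) % 4] for i in range(4)]
    return out


def add_round_key(s, k):
    return [a ^ b for a, b in zip(s, k)]


def aes128_encrypt_block(key, block):
    """Follows the encryption outline: 9 full rounds, then a round without MixColumns."""
    ks = round_keys(key)
    s = add_round_key(list(block), ks[0])
    for i in range(1, 10):
        s = add_round_key(mix_columns(shift_rows(sub_bytes(s))), ks[i])
    s = add_round_key(shift_rows(sub_bytes(s)), ks[10])
    return bytes(s)


key = bytes(range(16))
plaintext = bytes.fromhex("00112233445566778899aabbccddeeff")
print(aes128_encrypt_block(key, plaintext).hex())
```

The key and plaintext in the usage lines are the AES-128 example vector from FIPS 197, which makes the sketch easy to check against the standard.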


AES Key Schedule: The only thing left to describe is how AES computes the round keys from
the main key. Recall that the main key is 128 bits long, and we need to produce 11 round keys
$K_0, \ldots, K_{10}$, all of which consist of four 32-bit words, each word corresponding to a column of a
matrix as described above. The key schedule makes use of a round constant which we shall denote
by

$$RC_i \leftarrow x^{i-1} \pmod{x^8 + x^4 + x^3 + x + 1}.$$

We label the round keys as $(W_{4i}, W_{4i+1}, W_{4i+2}, W_{4i+3})$ where $i$ is the round. The initial main
key is first divided into four 32-bit words $(k_0, k_1, k_2, k_3)$. The round keys are then computed as in
Algorithm 13.3, where RotBytes is the function which rotates a word to the left by a single byte,
and SubBytes applies the AES encryption S-Box to every byte in a word.

Algorithm 13.3: AES key schedule

    W_0 <- k_0, W_1 <- k_1, W_2 <- k_2, W_3 <- k_3.
    for i <- 1 to 10 do
        T <- RotBytes(W_{4i-1}).
        T <- SubBytes(T).
        T <- T xor RC_i.
        W_{4i}   <- W_{4i-4} xor T.
        W_{4i+1} <- W_{4i-3} xor W_{4i}.
        W_{4i+2} <- W_{4i-2} xor W_{4i+1}.
        W_{4i+3} <- W_{4i-1} xor W_{4i+2}.
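The key schedule can be sketched on 32-bit words in Python. Two details are assumptions taken from FIPS 197 rather than from the text: the round constant for round i is $x^{i-1}$, and it is added into the most significant byte of T. The names `rot_bytes`, `sub_word` and `expand_key` are hypothetical.

```python
def gf_mul(a, b):
    """Multiply in F_{2^8} modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def _sbox_entry(s):
    """AES encryption S-Box entry: field inverse, then the FIPS 197 affine map."""
    x = 0
    if s:
        x = 1
        for _ in range(254):
            x = gf_mul(x, s)
    rot = lambda v, k: ((v << k) | (v >> (8 - k))) & 0xFF
    return x ^ rot(x, 1) ^ rot(x, 2) ^ rot(x, 3) ^ rot(x, 4) ^ 0x63


SBOX = [_sbox_entry(s) for s in range(256)]


def rot_bytes(word):
    """Rotate a 32-bit word to the left by one byte."""
    return ((word << 8) | (word >> 24)) & 0xFFFFFFFF


def sub_word(word):
    """Apply the S-Box to each of the four bytes of a word."""
    return int.from_bytes(bytes(SBOX[b] for b in word.to_bytes(4, "big")), "big")


def expand_key(key):
    """Return the 44 words W_0, ..., W_43 for a 16-byte main key."""
    W = [int.from_bytes(key[4 * j:4 * j + 4], "big") for j in range(4)]
    rc = 1                                    # RC_1 = x^0
    for i in range(1, 11):
        t = sub_word(rot_bytes(W[4 * i - 1])) ^ (rc << 24)
        W.append(W[4 * i - 4] ^ t)
        W.append(W[4 * i - 3] ^ W[4 * i])
        W.append(W[4 * i - 2] ^ W[4 * i + 1])
        W.append(W[4 * i - 1] ^ W[4 * i + 2])
        rc = gf_mul(rc, 2)                    # next round constant: multiply by x
    return W


W = expand_key(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
print(f"{W[4]:08x} {W[43]:08x}")
```

Round key i is then the four words W[4i], ..., W[4i+3].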


13.4.  Modes  of  Operation 

A block cipher like DES or AES can be used in a variety of ways to encrypt a data string. Soon
after DES was standardized another US Federal standard appeared giving four recommended ways
of using DES for data encryption. These modes of operation have since been standardized
internationally and can be used with any block cipher. The four modes are

•  ECB Mode: This is simple to use, but suffers from possible deletion and insertion attacks.
A one-bit error in the ciphertext gives an error in one whole block of the decrypted plaintext.

•  CBC Mode: This is probably the best of the original four modes of operation, since
it helps protect against deletion and insertion attacks. In this mode a one-bit error in
the ciphertext gives not only a one-block error in the corresponding plaintext but also a
one-bit error in the next decrypted plaintext block.

•  OFB Mode: This mode turns a block cipher into a stream cipher. It has the property
that a one-bit error in the ciphertext gives a one-bit error in the decrypted plaintext.

•  CFB Mode: This mode also turns a block cipher into a stream cipher. A single bit error
in the ciphertext affects both this block and the next, just as in CBC mode.

Over the years various other modes of operation have been presented. Probably the most popular
of the more modern modes is

•  CTR Mode: This also turns the block cipher into a stream cipher, but it enables blocks to
be processed in parallel, thus providing performance advantages when parallel processing
is available.

We shall now describe each of these five modes of operation in detail, and show what security
properties each has (or in most cases has not). Finally, we present a summary of these five basic modes
of operation in Tables 13.1 and 13.2. Throughout this section we ignore difficulties related to which
padding scheme should be used to pad a message out to a multiple of the block length. We discuss
padding schemes in Chapter 14. In the following discussion we let ECB[F], CBC[F], OFB[F], CFB[F]
and CTR[F] denote the mode of operation when instantiated with the function F, which could be
a block cipher, a pseudo-random permutation or a pseudo-random function depending on the
situation we are considering.

13.4.1. ECB Mode: Electronic Code Book Mode, or ECB Mode, is the simplest way to use a
block cipher. The data to be encrypted $m$ is divided into blocks of $n$ bits:

$$m_1, m_2, \ldots, m_q$$

with the last block padded if needed. The ciphertext blocks $c_1, \ldots, c_q$ are then defined as follows

$$c_i \leftarrow e_k(m_i),$$

as described in Figure 13.7. Decipherment is simply the reverse operation as explained in
Figure 13.8.

Figure 13.7. ECB encipherment

ECB Mode has a number of problems: the first is due to the property that if $m_i = m_j$ then we
have $c_i = c_j$, i.e. the same input block always generates the same output block. This is a problem
in practice since stereotyped beginnings and endings of messages are common. The second problem
comes because we could simply delete blocks from the message and no one would know. Thirdly
we could replay known blocks from other messages. By extracting ciphertext corresponding to a
known piece of plaintext we can then amend other transactions to contain this known block of text.
In terms of our security models from Chapter 11 we have the following.

Theorem 13.2. ECB Mode is not IND-PASS secure.

Figure 13.8. ECB decipherment


Proof. To prove something is not secure we need to exhibit an attack within the model. The
attack on ECB mode is very simple:

•  Let $\mathbf{0}$ denote the block of all zeros, and $\mathbf{1}$ denote the block of all ones.

•  Call the $\mathcal{O}_{LR}$ oracle with $m_0 = \mathbf{0} \| \mathbf{1}$ and $m_1 = \mathbf{1} \| \mathbf{1}$. The challenge ciphertext $c^*$ is returned
which is the encryption of $m_b$, for the hidden bit $b$. The challenge ciphertext consists of
two blocks $c_0$ and $c_1$.

•  If $c_0 \neq c_1$ then output $b' = 0$, else return $b' = 1$.

□
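The attack in this proof can be run end to end with a small Python sketch. The block cipher here is a toy four-round Feistel network built from SHA-256; it is purely a hypothetical stand-in for $e_k$ so that the example is self-contained, and it should not be mistaken for a secure cipher. All names are illustrative.

```python
import hashlib
import secrets

BLOCK = 16  # block length in bytes


def _f(key, i, half):
    """Round function of the toy Feistel network (illustration only)."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def enc_block(key, block):
    """Toy stand-in for e_k: a keyed permutation on 16-byte blocks."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _f(key, i, right))
    return left + right


def ecb_encrypt(key, msg):
    """ECB Mode: each block is encrypted independently."""
    return b"".join(enc_block(key, msg[i:i + BLOCK])
                    for i in range(0, len(msg), BLOCK))


ZERO, ONE = bytes(BLOCK), b"\xff" * BLOCK


def lr_oracle(key, b, m0, m1):
    """The challenger encrypts m_b for its hidden bit b."""
    return ecb_encrypt(key, m1 if b else m0)


def adversary(challenge):
    """Output b' = 0 if the two ciphertext blocks differ, else b' = 1."""
    c0, c1 = challenge[:BLOCK], challenge[BLOCK:]
    return 0 if c0 != c1 else 1


key, b = secrets.token_bytes(16), secrets.randbelow(2)
guess = adversary(lr_oracle(key, b, ZERO + ONE, ONE + ONE))
print(guess == b)
```

No encryption oracle is queried at any point; the adversary only inspects the challenge ciphertext.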


It is clear this attack works with a one hundred percent success rate; this is due to the fact that
ECB Mode is deterministic. One should also note that this attack is stronger than the standard
attack on deterministic encryption schemes (which requires access to an encryption oracle). The
ECB attack does not require such an oracle. Even if we restrict ourselves to a notion of one-way
security, ECB mode is not as secure as we would like.

Theorem 13.3. ECB Mode is not OW-CCA secure.

Proof. Again, to prove something is not secure we need to exhibit an attack within the model,
and again the attack is very simple:

•  Let $c^*$ denote the target ciphertext to be decrypted.

•  Let $r$ denote an arbitrary block, and form the ciphertext $c \leftarrow c^* \| r$.

•  Pass $c$ to the decryption oracle $\mathcal{O}_{d_k}$ to obtain the plaintext $m^* \| s$ for some block $s$.

•  Return $m^*$.

□
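The following Python sketch plays out this attack. As a hypothetical stand-in for the block cipher it uses a toy four-round Feistel network built from SHA-256, which is not secure and serves only to keep the example self-contained. The class `DecryptionOracle` and the function `attack` are illustrative names.

```python
import hashlib
import secrets

BLOCK = 16  # block length in bytes


def _f(key, i, half):
    """Round function of the toy Feistel network (illustration only)."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def enc_block(key, block):
    """Toy stand-in for e_k."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _f(key, i, right))
    return left + right


def dec_block(key, block):
    """Toy stand-in for d_k: undo the Feistel rounds in reverse order."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _f(key, i, left)), left
    return left + right


def ecb(block_fn, key, data):
    return b"".join(block_fn(key, data[i:i + BLOCK])
                    for i in range(0, len(data), BLOCK))


class DecryptionOracle:
    """Decrypts anything except the target ciphertext itself."""

    def __init__(self, key, target):
        self._key, self._target = key, target

    def __call__(self, ciphertext):
        if ciphertext == self._target:
            raise ValueError("the target ciphertext may not be queried")
        return ecb(dec_block, self._key, ciphertext)


def attack(oracle, c_star):
    """Append an arbitrary block r, query the oracle, drop the last block."""
    r = bytes(BLOCK)
    m = oracle(c_star + r)
    return m[:len(c_star)]


key = secrets.token_bytes(16)
m_star = secrets.token_bytes(2 * BLOCK)
c_star = ecb(enc_block, key, m_star)
print(attack(DecryptionOracle(key, c_star), c_star) == m_star)
```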


However,  we  can  show  the  following  positive  result 

Theorem 13.4. ECB mode is OW-CPA secure assuming the underlying block cipher $e_k$ acts like
a pseudo-random permutation. In particular, let $A$ denote an adversary against ECB Mode which
makes $q_e$ queries to its encryption oracle, where each query is a single block in length, and where
the challenge ciphertext is $\ell$ distinct blocks in length². There is an adversary $B$ such that

$$\mathrm{Adv}^{\mathrm{OW\text{-}CPA}}_{\mathrm{ECB}[e_k]}(A; q_e) \le \mathrm{Adv}^{\mathrm{PRP}}_{e_k}(B) + \frac{q_e}{2^n},$$

where $n$ is the block size of the cipher $e_k$.

² These assumptions can be changed by making a suitably more complex probability estimate.
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Proof.  Our  main  step  in  the  proof  is  to  replace  the  underlying  block  cipher  by  a  pseudo-random 
permutation. This can be done by the assumption that $e_k$ is a secure PRP, namely there is some
adversary $B$ such that

$$\mathrm{Adv}^{\mathrm{PRP}}_{e_k}(B) = \Bigl| \Pr[A \text{ wins } \mathrm{ECB}[e_k]] - \Pr[A \text{ wins } \mathrm{ECB}[\mathcal{P}]] \Bigr|,$$

where we let $\mathcal{P}$ denote a random permutation. We now need to bound the probability that $A$ wins
this latter game, i.e. $\Pr[A \text{ wins } \mathrm{ECB}[\mathcal{P}]]$. It is easier to bound the probability that $A$ does not win.
Since for a PRP the adversary cannot learn anything about the output value of the permutation
until she queries the permutation on the specific input value, the probability that she does not win
is given by the probability that out of the $q_e$ distinct queries to the encryption oracle we do not
obtain all of the $\ell$ blocks in the challenge ciphertext. Setting $N = 2^n$ this gives us, where ${}^xC_y$ is
the function which returns the number of combinations of $y$ objects selected from $x$,

$$\Pr[A \text{ wins } \mathrm{ECB}[\mathcal{P}]] = 1 - \Pr[A \text{ does not win } \mathrm{ECB}[\mathcal{P}]]$$
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(N-iy. 
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Hence. 


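$$\begin{aligned}
\Pr[A \text{ wins } \mathrm{ECB}[e_k]] &= \Pr[A \text{ wins } \mathrm{ECB}[e_k]] + \bigl(\Pr[A \text{ wins } \mathrm{ECB}[\mathcal{P}]] - \Pr[A \text{ wins } \mathrm{ECB}[\mathcal{P}]]\bigr) && \text{adding in zero} \\
&\le \Bigl| \Pr[A \text{ wins } \mathrm{ECB}[e_k]] - \Pr[A \text{ wins } \mathrm{ECB}[\mathcal{P}]] \Bigr| + \Bigl| \Pr[A \text{ wins } \mathrm{ECB}[\mathcal{P}]] \Bigr| && \text{triangle inequality} \\
&\le \mathrm{Adv}^{\mathrm{PRP}}_{e_k}(B) + \frac{q_e}{2^n}.
\end{aligned}$$

□

Notice that when $q_e$ is small relative to $2^n$ the probability $q_e/2^n$ is very close to zero, whereas as
$q_e$ approaches $2^n$ we obtain a probability close to one.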


13.4.2. CBC Mode: One way of countering the problems with ECB Mode is to chain the cipher,
and in this way add context to each ciphertext block. The easiest way of doing this is to use Cipher
Block Chaining Mode, or CBC Mode. Again, the plaintext must first be divided into a series of
blocks

$$m_1, \ldots, m_q,$$

and as before the final block may need padding to make the plaintext length a multiple of the block
length. Encryption is then performed as in Figure 13.9, or equivalently via the equations

$$c_1 \leftarrow e_k(m_1 \oplus IV),$$

$$c_i \leftarrow e_k(m_i \oplus c_{i-1}) \quad \text{for } i > 1.$$

The output ciphertext is $IV \| c_1 \| c_2 \| \cdots$. The transmission of $IV$ with the ciphertext can
be dropped if the receiver will know what the value will be a priori.

Figure 13.9. CBC encipherment

Decryption also requires the $IV$ and is performed as in Figure 13.10, or via the equations

$$m_1 \leftarrow d_k(c_1) \oplus IV,$$

$$m_i \leftarrow d_k(c_i) \oplus c_{i-1} \quad \text{for } i > 1.$$

Figure 13.10. CBC decipherment

With  ECB  Mode  a  single-bit  error  in  transmission  of  the  ciphertext  will  result  in  a  whole 
block  being  decrypted  wrongly,  whilst  in  CBC  Mode  we  see  that  not  only  will  we  decrypt  a  block 
incorrectly  but  the  error  will  also  affect  a  single  bit  of  the  next  block. 
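The CBC equations and this error-propagation behaviour can be observed with a small Python sketch. The block cipher is a toy four-round Feistel network built from SHA-256, a hypothetical stand-in for $e_k$ and $d_k$ that is not secure; padding is ignored, so messages are assumed to be a whole number of blocks. All names are illustrative.

```python
import hashlib
import secrets

BLOCK = 16  # block length in bytes


def _f(key, i, half):
    """Round function of the toy Feistel network (illustration only)."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def enc_block(key, block):
    """Toy stand-in for e_k."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, xor(left, _f(key, i, right))
    return left + right


def dec_block(key, block):
    """Toy stand-in for d_k."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = xor(right, _f(key, i, left)), left
    return left + right


def cbc_encrypt(key, iv, msg):
    """c_1 = e_k(m_1 xor IV), c_i = e_k(m_i xor c_{i-1}); returns IV || c_1 || ..."""
    prev, out = iv, [iv]
    for i in range(0, len(msg), BLOCK):
        prev = enc_block(key, xor(msg[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key, ciphertext):
    """m_i = d_k(c_i) xor c_{i-1}, with c_0 = IV taken from the ciphertext."""
    blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    return b"".join(xor(dec_block(key, c), prev)
                    for prev, c in zip(blocks, blocks[1:]))


key, iv = secrets.token_bytes(16), secrets.token_bytes(BLOCK)
msg = bytes(range(3 * BLOCK))
damaged = bytearray(cbc_encrypt(key, iv, msg))
damaged[BLOCK] ^= 0x01                      # flip one bit of c_1
garbled = cbc_decrypt(key, bytes(damaged))

# Per-block comparison: block 1 is wrecked, block 2 has one flipped bit, block 3 is intact.
for i in range(0, len(msg), BLOCK):
    print(xor(garbled[i:i + BLOCK], msg[i:i + BLOCK]).hex())
```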

Notice  that  we  require  an  additional  initial  value  IV.  This  is  either  a  unique,  i.e.  never  repeated, 
value  passed  to  the  encryption  function  (in  which  case  we  are  said  to  have  a  nonce-based  encryption 
scheme),  or  it  is  fixed  to  a  specific  value  (in  which  case  we  have  a  deterministic  variant  of  CBC 
Mode),  or  it  is  a  truly  random  value  chosen  internally  by  the  mode  of  operation.  Each  of  these 
choices  leads  to  entirely  different  security  properties  as  we  shall  now  demonstrate.  When  a  non-fixed 
IV  is  used  the  IV  value  is  prepended  to  the  ciphertext. 

Fixed  IV:  Clearly  with  a  fixed  IV  CBC  Mode  is  deterministic  and  hence  cannot  be  IND-CPA 
secure,  however  one  can  show  it  is  IND-PASS  secure  (unlike  ECB  Mode).  This  follows  from  the 
proof  method  of  IND-CPA  security  in  the  random  IV  case  given  below.  The  crucial  point  to  note  is 
that  whilst  the  adversary  has  control  over  the  input  to  the  first  call  to  the  block  cipher,  he  has  no 
control over the other calls when encrypting a multi-block message.

Nonce IV: Here we think of CBC Mode as a nonce-based encryption scheme, as in Section 11.6.4.
We start with the negative result.

Theorem 13.5. With a nonce as the IV, CBC Mode is not IND-CPA secure.

Proof. Let $\mathbf{0}$ be the all-zero block and $\mathbf{1}$ be the all-one block. The attack on the IND-CPA security
is as follows:

•  Send the message $\mathbf{0}$ with the nonce $IV = \mathbf{0}$ to the encryption oracle $\mathcal{O}_{e_k}$. The adversary
obtains the ciphertext $\mathbf{0} \| c$ in return, where $c = e_k(\mathbf{0})$.

•  Now send the messages $m_0 = \mathbf{0}$ and $m_1 = \mathbf{1}$ to the $\mathcal{O}_{LR}$ oracle, with nonce $\mathbf{1}$. Notice this
is a new nonce and so the encryption is allowed in the game. Let $\mathbf{1} \| c^*$ be the returned
ciphertext.

•  If $c^* = c$ then return $b' = 1$, else return $b' = 0$.

To see why this attack works, note that if the hidden bit is $b = 1$ then the challenger returns $c^*$
which is the evaluation of the block cipher on the block $\mathbf{1} \oplus \mathbf{1} = \mathbf{0}$. Whereas if $b = 0$ then the
evaluation is on the block $\mathbf{0} \oplus \mathbf{1} = \mathbf{1}$. □

On the positive side, when used only once nonce-based encryption is identical to a fixed IV, and so
CBC Mode used in a nonce-based encryption methodology is IND-PASS secure.
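The nonce attack of Theorem 13.5 can be simulated in Python. The sketch uses a toy four-round Feistel network built from SHA-256 as a hypothetical stand-in for $e_k$ (not a secure cipher), and a `Challenger` class, also hypothetical, that enforces the never-repeat rule on nonces.

```python
import hashlib
import secrets

BLOCK = 16  # block length in bytes


def _f(key, i, half):
    """Round function of the toy Feistel network (illustration only)."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def enc_block(key, block):
    """Toy stand-in for e_k."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, xor(left, _f(key, i, right))
    return left + right


def cbc_encrypt(key, iv, msg):
    """Nonce-based CBC: the caller supplies the IV; output is IV || c_1 || ..."""
    prev, out = iv, [iv]
    for i in range(0, len(msg), BLOCK):
        prev = enc_block(key, xor(msg[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


class Challenger:
    """IND-CPA game for nonce-based CBC; rejects repeated nonces."""

    def __init__(self):
        self._key = secrets.token_bytes(16)
        self.b = secrets.randbelow(2)       # hidden bit (exposed only for checking)
        self._used = set()

    def _enc(self, nonce, msg):
        if nonce in self._used:
            raise ValueError("nonce reuse is not allowed")
        self._used.add(nonce)
        return cbc_encrypt(self._key, nonce, msg)

    def encrypt(self, nonce, msg):
        return self._enc(nonce, msg)

    def lr(self, nonce, m0, m1):
        return self._enc(nonce, m1 if self.b else m0)


ZERO, ONE = bytes(BLOCK), b"\xff" * BLOCK


def adversary(challenger):
    c = challenger.encrypt(ZERO, ZERO)[BLOCK:]       # c = e_k(0)
    c_star = challenger.lr(ONE, ZERO, ONE)[BLOCK:]   # fresh nonce 1
    return 1 if c_star == c else 0


game = Challenger()
print(adversary(game) == game.b)
```

Both queries use distinct nonces, so the game's rules are respected and the adversary still wins every time.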

Random IV: With a random IV we can be more positive, since CBC Mode is IND-CPA secure as
we will now show.

Theorem 13.6. With a random IV, CBC Mode is IND-CPA secure assuming the underlying block
cipher $e_k$ acts like a pseudo-random permutation. In particular let $A$ denote an adversary against
CBC Mode which makes $q_e$ queries to its encryption oracle, and let all plaintext submitted to both
the LR and encryption oracles be at most $\ell$ blocks in length. Then there is an adversary $B$ such
that

$$\mathrm{Adv}^{\mathrm{IND\text{-}CPA}}_{\mathrm{CBC}[e_k]}(A; q_e) \le \mathrm{Adv}^{\mathrm{PRP}}_{e_k}(B) + \frac{3 \cdot T^2}{2^n},$$

where $n$ is the block size of the cipher $e_k$ and $T = (q_e + 1) \cdot \ell$.


Proof. In the security game the challenger needs to call the underlying block cipher on behalf of
the adversary. The total number of such calls is bounded by $T = (q_e + 1) \cdot \ell$.

Our first step in the proof is to replace the underlying block cipher by a pseudo-random
permutation. This can be done by the assumption that $e_k$ is a secure PRP, namely there is some
adversary $B$ such that

$$\mathrm{Adv}^{\mathrm{PRP}}_{e_k}(B) = \Bigl| \Pr[A \text{ wins } \mathrm{CBC}[e_k]] - \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{P}]] \Bigr|, \tag{19}$$

where we let $\mathcal{P}$ denote a random permutation. Our next step is to switch from the component
being a random permutation to a random function. This follows in the same way as we proved the
PRF-PRP Switching Lemma (Lemma 11.2). Suppose we replace $\mathcal{P}$ by a random function $\mathcal{F}$ in the
CBC game and we let $E$ denote the event, during the game $\mathrm{CBC}[\mathcal{F}]$, that the adversary makes two
calls to $\mathcal{F}$ which result in the same output value. We have

$$\begin{aligned}
\Bigl| \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{P}]] - \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}]] \Bigr|
&= \Bigl| \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{P}]] - \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}] \wedge \neg E] - \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}] \wedge E] \Bigr| \\
&= \Bigl| \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}] \mid \neg E] - \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}] \mid E] \Bigr| \cdot \Pr[E] \\
&\le \Pr[E] \le \frac{T^2}{2^{n+1}},
\end{aligned} \tag{20}$$

since if $E$ does not happen the two games are identical from the point of view of the adversary,
and by the birthday bound $\Pr[E] \le T^2/2^{n+1}$.

Our final task is to bound the probability of $A$ winning the CBC game when the underlying
“block cipher” is a random function. First let us consider how the challenger works in the game
$\mathrm{CBC}[\mathcal{F}]$. When the adversary makes an $\mathcal{O}_{LR}$ or $\mathcal{O}_{e_k}$ call, the challenger answers the query by calling
the random function. As we are dealing with a random function, and not a random permutation,
the challenger can select the output value of $\mathcal{F}$ independently from the codomain; i.e. it does not
need to adjust the output values depending on the previous values. This last point will make our
analysis simpler, and is why we switched to the PRF game from the PRP game.

Now notice that the adversary does not control the inputs to the random function at any stage
in the game, so the only way he can find any information is by creating an input collision, i.e. two
calls the challenger makes to the random function are on the same input values³.

We thus let $M_j$ denote the event that the adversary makes an input collision happen within
the first $j$ calls, and note that if $M_T$ does not happen then the adversary's probability of winning
is 1/2, i.e. the best he can do is guess. We have

$$\begin{aligned}
\Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}]] &= \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}] \mid M_T] \cdot \Pr[M_T] + \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}] \mid \neg M_T] \cdot \Pr[\neg M_T] \\
&\le \Pr[M_T] + \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}] \mid \neg M_T] \\
&\le \Pr[M_T] + \frac{1}{2}.
\end{aligned} \tag{21}$$

So we are left with estimating $\Pr[M_T]$. Clearly, we have $\Pr[M_1] = 0$, and for all $j > 1$ we have

$$\Pr[M_j] \le \Pr[M_{j-1}] + \Pr[M_j \mid \neg M_{j-1}],$$

from which it follows that

$$\Pr[M_j] \le \frac{(j - 1) \cdot (q_e + 1) \cdot \ell}{2^n},$$

from which follows $\Pr[M_T] \le T^2/2^n$.

³ This is why CBC Mode is not secure in the nonce-based setting, as in this setting the adversary controls the
first input block, by selecting the first block of the message and the IV.
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Summing  up  we  have 


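$$\begin{aligned}
\mathrm{Adv}^{\mathrm{IND\text{-}CPA}}_{\mathrm{CBC}[e_k]}(A; q_e)
&= 2 \cdot \Bigl| \Pr[A \text{ wins } \mathrm{CBC}[e_k]] - \tfrac{1}{2} \Bigr| \\
&= 2 \cdot \Bigl| \Pr[A \text{ wins } \mathrm{CBC}[e_k]] + \bigl(\Pr[A \text{ wins } \mathrm{CBC}[\mathcal{P}]] - \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{P}]]\bigr) - \tfrac{1}{2} \Bigr| && \text{adding in zero} \\
&\le 2 \cdot \Bigl| \Pr[A \text{ wins } \mathrm{CBC}[e_k]] - \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{P}]] \Bigr| + 2 \cdot \Bigl| \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{P}]] - \tfrac{1}{2} \Bigr| && \text{triangle inequality} \\
&= \mathrm{Adv}^{\mathrm{PRP}}_{e_k}(B) + 2 \cdot \Bigl| \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{P}]] - \tfrac{1}{2} \Bigr| && \text{by equation (19)} \\
&= \mathrm{Adv}^{\mathrm{PRP}}_{e_k}(B) + 2 \cdot \Bigl| \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{P}]] - \tfrac{1}{2} + \bigl(\Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}]] - \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}]]\bigr) \Bigr| && \text{adding in zero} \\
&\le \mathrm{Adv}^{\mathrm{PRP}}_{e_k}(B) + 2 \cdot \Bigl| \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{P}]] - \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}]] \Bigr| + 2 \cdot \Bigl| \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}]] - \tfrac{1}{2} \Bigr| && \text{triangle inequality} \\
&\le \mathrm{Adv}^{\mathrm{PRP}}_{e_k}(B) + \frac{T^2}{2^n} + 2 \cdot \Bigl| \Pr[A \text{ wins } \mathrm{CBC}[\mathcal{F}]] - \tfrac{1}{2} \Bigr| && \text{by equation (20)} \\
&\le \mathrm{Adv}^{\mathrm{PRP}}_{e_k}(B) + \frac{T^2}{2^n} + 2 \cdot \Pr[M_T] && \text{by equation (21)} \\
&\le \mathrm{Adv}^{\mathrm{PRP}}_{e_k}(B) + \frac{3 \cdot T^2}{2^n}.
\end{aligned}$$

□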


Let us examine what this means when we use the AES block cipher in CBC Mode. First the block
length of AES is n = 128, and let us assume the key size is 128 as well. If we assume AES behaves
as a PRP, then we expect that

\[
\mathrm{Adv}^{\text{PRP}}_{\mathrm{AES}}(B) \le \frac{1}{2^{128}}
\]

for all adversaries B. We can now work out the advantage for any adversary A to break AES when
used in CBC mode, in the sense of IND-CPA. We find

\[
\mathrm{Adv}^{\text{IND-CPA}}_{\mathrm{CBC}[\mathrm{AES}]}(A; q_e) \le \frac{1 + 3 \cdot T^2}{2^{128}}.
\]

Thus even if the adversary makes $2^{30}$ calls to the underlying block cipher, the advantage will still
be less than

\[
\frac{1 + 3 \cdot 2^{60}}{2^{128}} \approx 2^{-66},
\]

which is incredibly small. Thus as long as we restrict the usage of AES in CBC Mode with a
random IV to encrypting around $2^{30}$ blocks per key we will have a secure cipher. Restricting the
usage of a symmetric cipher per key is enabled by requiring a user to generate a new key every so
often.
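
The arithmetic above can be checked with exact rational arithmetic. The following is a minimal sketch: the function name `cbc_ind_cpa_bound` is an illustrative choice, and the PRP advantage of $2^{-128}$ is the assumption made in the text rather than a proven property of AES.

```python
from fractions import Fraction
from math import log2


def cbc_ind_cpa_bound(prp_advantage: Fraction, T: int, n: int) -> Fraction:
    """Return Adv_PRP + 3*T^2 / 2^n, the bound derived above for CBC with a random IV."""
    return prp_advantage + Fraction(3 * T * T, 2 ** n)


# Assumed inputs from the text: n = 128, Adv_PRP <= 2^-128, T = 2^30 block cipher calls.
bound = cbc_ind_cpa_bound(Fraction(1, 2 ** 128), 2 ** 30, 128)
print(f"bound is about 2^{log2(bound):.1f}")
```

Changing `T` shows how quickly the bound degrades: each doubling of the number of block cipher calls costs two bits of security, which is why the per-key usage limit matters.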
For all three variants CBC Mode is not OW-CCA secure and hence not IND-CCA secure, due to
a similar attack to that in Theorem 13.3. The OW-CPA security of CBC Mode, for all three ways
of picking the IV, follows from Theorem 13.4 due to the proof of the following theorem.

Theorem 13.7. For any method of choosing the IV, CBC mode is OW-CPA secure assuming the
underlying block cipher $e_k$ acts like a pseudo-random permutation. In particular let A denote an
adversary against CBC Mode which makes $q_e$ queries to its encryption oracle (with each query being
at most $\ell$ blocks in length), then there is an adversary B such that

\[
\mathrm{Adv}^{\text{OW-CPA}}_{\mathrm{CBC}[e_k]}(A; q_e) = \mathrm{Adv}^{\text{OW-CPA}}_{\mathrm{ECB}[e_k]}(B; q_e \cdot \ell).
\]


Proof. For uniform challenge messages the distributions (for a fixed number of blocks) of the
challenges given to adversary A and adversary B are identical. Algorithm B works as follows. We
let $c^* = c_1 \| c_2 \| \cdots$ denote the challenge ciphertext passed to adversary B; we pass this to adversary
A, who returns the CBC Mode decryption of $c^*$, with the IV being anything that the given method
allows. We let $m'_1 \| m'_2 \| \cdots$ be the returned plaintext. When algorithm A makes an encryption
oracle request, this is answered by calling algorithm B's encryption oracle a block at a time, and
simulating CBC Mode. Then, using the known IV used when passing $c^*$ to A, algorithm B can
decrypt $c^*$ under ECB Mode to obtain $m_1 \| m_2 \| \cdots$, using the following equations:

\[
m_1 = m'_1 \oplus IV, \qquad m_i = m'_i \oplus c_{i-1} \quad \text{for } i > 1.
\]


□ 
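
The conversion step in the proof can be sketched as follows. This is an illustration under stated assumptions: the 4-round Feistel network built from HMAC-SHA256 is a hypothetical stand-in for $e_k$ (it is not AES and makes no security claim), and the names `cbc_decrypt` and `ecb_from_cbc` are chosen only for this example.

```python
import hashlib
import hmac

BLOCK = 16          # block size in bytes (n = 128 bits)
HALF = BLOCK // 2


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    # Round function of the toy Feistel network.
    return hmac.new(key, bytes([rnd]) + half, hashlib.sha256).digest()[:HALF]


def e(key: bytes, block: bytes) -> bytes:
    """Toy invertible permutation standing in for e_k (NOT a real cipher)."""
    left, right = block[:HALF], block[HALF:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right


def d(key: bytes, block: bytes) -> bytes:
    """Inverse of e: undo the four Feistel rounds in reverse order."""
    left, right = block[:HALF], block[HALF:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right


def cbc_decrypt(key: bytes, iv: bytes, c: list) -> list:
    """What adversary A returns: m'_i = d_k(c_i) XOR c_{i-1}, with c_0 = IV."""
    prev = [iv] + c[:-1]
    return [xor(d(key, ci), pi) for ci, pi in zip(c, prev)]


def ecb_from_cbc(iv: bytes, c: list, m_prime: list) -> list:
    """What algorithm B computes without the key: m_i = m'_i XOR c_{i-1}."""
    prev = [iv] + c[:-1]
    return [xor(mi, pi) for mi, pi in zip(m_prime, prev)]
```

Note that `ecb_from_cbc` never touches the key: B only needs the IV it handed to A and the challenge ciphertext blocks themselves.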


13.4.3. OFB Mode: Output Feedback Mode, or OFB Mode, enables a block cipher to be used
as a stream cipher. We use the block cipher to create the keystream, n bits at a time, where n is
the block size. Again we divide the plaintext into a series of blocks, each block being n bits long:

\[
m_1, \ldots, m_q.
\]

Encryption is performed as follows; see Figure 13.11 for a graphical representation. First we set
$Y_0 \leftarrow IV$, then for $i = 1, 2, \ldots, q$, we perform the following steps,

\[
Y_i \leftarrow e_k(Y_{i-1}), \qquad c_i \leftarrow m_i \oplus Y_i.
\]

The output ciphertext is $IV \| c_1 \| c_2 \| \cdots \| c_q$. Decipherment in OFB Mode is performed in a similar
manner.


Figure  13.11.  OFB  encipherment  and  decipherment 




We  now  turn  to  discussing  security  of  OFB  Mode.  Clearly,  when  used  with  a  fixed  IV,  OFB 
Mode  is  not  IND-CPA  secure;  it  turns  out  it  is  not  OW-CPA  secure  either  when  used  with  a  fixed 
IV  as  the  next  result  demonstrates. 

Theorem  13.8.  When  used  with  a  fixed  IV,  OFB  Mode  is  not  OW-CPA  secure. 

Proof.  Call  the  encryption  oracle  on  the  plaintext  consisting  of  all  zero  blocks.  This  reveals 
the  “keystream”,  which  can  then  be  exclusive-or-ed  with  the  challenge  ciphertext  to  produce  the 
required  plaintext.  □ 
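
The attack in this proof can be sketched in a few lines. The block function `e` below is an assumption made for illustration (HMAC-SHA256 truncated to one block; OFB Mode only ever uses the forward direction of the cipher), and `recover_plaintext` is a hypothetical name for the adversary.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def e(key: bytes, block: bytes) -> bytes:
    """Stand-in for e_k (an assumption for this sketch, not a real block cipher)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ofb_encrypt(key: bytes, iv: bytes, blocks: list) -> list:
    y, out = iv, []
    for m in blocks:
        y = e(key, y)            # Y_i <- e_k(Y_{i-1})
        out.append(xor(m, y))    # c_i <- m_i XOR Y_i
    return out


def recover_plaintext(enc_oracle, c_star: list) -> list:
    """Fixed-IV attack: encrypting all-zero blocks reveals the keystream."""
    keystream = enc_oracle([bytes(BLOCK)] * len(c_star))
    return [xor(c, y) for c, y in zip(c_star, keystream)]
```

With a fixed IV the oracle reuses the same keystream for every message, so a single query on zero blocks suffices.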

Also  on  the  negative  side,  for  any  method  of  selecting  the  IV,  OFB  Mode  is  not  OW-CCA  secure 
and  hence  is  not  IND-CCA  secure;  again  the  “attack”  is  exactly  the  same  as  in  Theorem  13.3.  On 
another  negative  side  we  have  the  following. 

Theorem  13.9.  OFB  Mode  is  not  OW-CPA  secure  in  the  nonce-based  setting,  i.e.  when  the  IV 
is  a  nonce. 


Proof. The adversary first picks a nonce n and asks the encryption oracle for an encryption of
$m' = m'_1 \| m'_2 \| \cdots$. The ciphertext $c_1 \| c_2 \| \cdots$ will be returned, from which the adversary can work
out the keystream starting from IV n, and hence also from IV $c_1 \oplus m'_1 = e_k(n)$. The adversary
now asks for the challenge ciphertext $c^*$ with nonce $c_1 \oplus m'_1 = e_k(n)$. Using the previously obtained
keystream the adversary can recover the message encrypted by $c^*$. □
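
A minimal sketch of this nonce-based attack follows. As an assumption for illustration, `e` is HMAC-SHA256 truncated to one block, and the two oracles are modelled as plain callables; the names are hypothetical. The probe message is one block longer than the challenge so that the recovered keystream covers every challenge block.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def e(key: bytes, block: bytes) -> bytes:
    """Stand-in for e_k (an assumption for this sketch, not a real block cipher)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ofb_encrypt(key: bytes, nonce: bytes, blocks: list) -> list:
    y, out = nonce, []
    for m in blocks:
        y = e(key, y)
        out.append(xor(m, y))
    return out


def recover_with_chosen_nonce(enc_oracle, challenge_oracle, t: int) -> list:
    """Recover a t-block challenge plaintext by choosing the nonce e_k(n)."""
    n = bytes(BLOCK)                                   # first nonce, chosen freely
    probe = [bytes(BLOCK)] * (t + 1)                   # known message m'
    c = enc_oracle(n, probe)
    keystream = [xor(ci, mi) for ci, mi in zip(c, probe)]   # Y_1, ..., Y_{t+1}
    c_star = challenge_oracle(keystream[0])            # challenge under nonce Y_1 = e_k(n)
    # Under nonce Y_1 the keystream is Y_2, Y_3, ..., which is already known.
    return [xor(ci, yi) for ci, yi in zip(c_star, keystream[1:])]
```

The two nonces used, n and $e_k(n)$, are distinct except with negligible probability, so the adversary respects the nonce rule.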


So  we  are  left  to  consider  what  positive  results  hold.  It  also  turns  out  that  the  proof  of  Theorem 
13.6  can  be  applied  to  OFB  Mode  as  well  as  CBC  Mode,  so  we  have  that  OFB  Mode  is  IND-CPA 
secure,  assuming  the  underlying  block  cipher  is  a  secure  pseudo-random  permutation,  when  used 
with  a  random  IV. 


13.4.4. CFB Mode: The next mode we consider is called Cipher FeedBack Mode, or CFB Mode.
This is very similar to OFB Mode in that we use the block cipher to produce a stream cipher. Recall
that in OFB Mode the keystream was generated by encrypting IV and then iteratively encrypting
the output from the previous encryption. In CFB Mode the keystream output is produced by the
encryption of the ciphertext, as in Figure 13.12, by the following steps, upon setting $Y_0 \leftarrow IV$,

\[
Z_i \leftarrow e_k(Y_{i-1}), \qquad Y_i \leftarrow m_i \oplus Z_i,
\]

so that the ith ciphertext block is $c_i = Y_i$. We do not present the decryption steps, but leave
this as an exercise for the reader.

Figure 13.12. CFB encipherment

CFB Mode has an interesting attack against it. Suppose the challenge ciphertext is
$IV \| c_1 \| c_2 \| \cdots$, then if we query the encryption oracle with the zero plaintext block, but IV
equal to $c_1$, we will obtain a ciphertext $c'$ such that $c_2 \oplus c' = m_2$. Continuing in this way we can
recover all the plaintext blocks bar the first one. This clearly leads to an attack against IND
security for nonce-based encryption. However it does not lead to an attack against the full
one-wayness, as one cannot recover the first block.

Theorem  13.10.  CFB  Mode  is  not  IND-CPA  secure  when  used  as  a  nonce-based  encryption 
scheme. 
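
The attack on nonce-based CFB Mode described before the theorem can be sketched as follows. As an assumption for illustration, `e` is HMAC-SHA256 truncated to one block, the encryption oracle is modelled as a callable taking a nonce and a list of blocks, and `recover_tail` is a hypothetical name.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def e(key: bytes, block: bytes) -> bytes:
    """Stand-in for e_k (an assumption for this sketch, not a real block cipher)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def cfb_encrypt(key: bytes, iv: bytes, blocks: list) -> list:
    y, out = iv, []
    for m in blocks:
        y = xor(m, e(key, y))    # Y_i <- m_i XOR e_k(Y_{i-1}); Y_i is the ciphertext block
        out.append(y)
    return out


def recover_tail(enc_oracle, c_star: list) -> list:
    """Recover m_2, m_3, ... from the challenge blocks c_1, c_2, ... (IV not needed)."""
    zero = bytes(BLOCK)
    recovered = []
    for i in range(1, len(c_star)):
        # Encrypting a zero block under nonce c_{i} (1-indexed) returns e_k(c_i).
        z = enc_oracle(c_star[i - 1], [zero])[0]
        recovered.append(xor(c_star[i], z))
    return recovered
```

The first block stays hidden because recovering it would require a query with the challenge nonce itself, which the nonce rule forbids.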

Just  as  with  OFB  Mode,  the  proof  of  Theorem  13.6  can  be  applied,  so  we  have  that  CFB  Mode  with 
a  random  IV  is  IND-CPA  secure  assuming  the  underlying  block  cipher  is  a  secure  pseudo-random 
permutation.  We  can  also  use  the  proof  of  Theorem  13.4,  to  show  that  CFB  mode  is  OW-CPA 
secure  when  used  as  a  nonce-based/random  IV  scheme. 

13.4.5.  CTR  Mode:  The  next  mode  we  consider  is  called  Counter  Mode,  or  CTR  Mode.  This 
combines  many  of  the  advantages  of  ECB  Mode,  but  with  none  of  the  disadvantages.  We  first 
select a public IV, or counter, which is chosen differently for each message encrypted under the
fixed key k. Then encryption proceeds for the ith block, by encrypting the value of IV + i and
then exclusive-or'ing this with the message block. In other words we have

\[
c_i \leftarrow m_i \oplus e_k(IV + \langle i \rangle_n),
\]

where $\langle i \rangle_n$ is the n-bit representation of the number i. This is explained pictorially in Figure 13.13.
An important property to preserve security of CTR Mode is that the counter input to any of the
block cipher calls may not be reused in any subsequent encryption. In the case of a random IV
chosen by the encryptor, reuse is avoided with overwhelming probability, assuming we limit the
number of block cipher invocations with a given key. In the case of nonce-based encryption we
simply have to ensure that if we encrypt a t-block message with nonce IV = j, then we never take
a new nonce in the range $[j, \ldots, j + t - 1]$.


[Figure 13.13. CTR encipherment: for i = 1, 2, 3 the value IV + i is encrypted under $e_k$ and
exclusive-or'ed with $m_i$ to give $c_i$.]
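
CTR Mode and the nonce convention described above can be sketched as follows. This is an illustration under stated assumptions: `e` is HMAC-SHA256 truncated to one block rather than a real block cipher, the addition IV + i is taken modulo $2^n$, and `NonceAllocator` is a hypothetical helper showing one way to keep counter ranges disjoint.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes (n = 128 bits)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def e(key: bytes, block: bytes) -> bytes:
    """Stand-in for e_k (an assumption for this sketch, not a real block cipher)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def ctr_encrypt(key: bytes, iv: int, blocks: list) -> list:
    """c_i <- m_i XOR e_k(IV + i) for i = 1, 2, ...; decryption is the same operation."""
    out = []
    for i, m in enumerate(blocks, start=1):
        counter = (iv + i) % (1 << (8 * BLOCK))      # reduce modulo 2^n
        out.append(xor(m, e(key, counter.to_bytes(BLOCK, "big"))))
    return out


class NonceAllocator:
    """Hands out nonces so that counter inputs of different messages never overlap."""

    def __init__(self) -> None:
        self._next = 0

    def reserve(self, t: int) -> int:
        j = self._next
        self._next += t      # a t-block message with nonce j rules out nonces j, ..., j+t-1
        return j
```

Because each block depends only on its own counter value, the loop body could be run for all blocks in parallel, which is the performance advantage the text goes on to discuss.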

CTR  Mode  has  a  number  of  interesting  properties.  Firstly,  since  each  block  can  be  encrypted 
independently,  much  like  in  ECB  Mode,  we  can  process  each  block  at  the  same  time.  Compare 
this  to  CBC  Mode,  OFB  Mode  or  CFB  Mode  where  we  cannot  start  encrypting  the  second  block 
until  the  first  block  has  been  encrypted.  This  means  that  encryption,  and  decryption,  can  be 
performed  in  parallel.  Another  performance  advantage  comes  from  the  fact  that  we  only  ever  apply 
the  encryption  operation  of  the  underlying  block  cipher. 

However, unlike ECB Mode, two equal blocks will not encrypt to the same ciphertext value.
This is because each plaintext block is encrypted using a different input to the encryption function;
in some sense we are using the block cipher encryption of the different inputs to produce a stream
cipher. Also unlike ECB Mode each ciphertext block corresponds to a precise position within the
ciphertext, as one needs the position information to be able to decrypt it successfully.

In terms of security properties, the security proof for IND-CPA security is actually simpler than
that used for CBC Mode. The reason is that with the above restrictions on the reuse of IVs we
never have to worry about an input collision for the block cipher. Thus, assuming the total number
of blocks encrypted with a fixed key is kept relatively low, we have the following.

Theorem 13.11. CTR Mode is IND-CPA secure assuming the underlying block cipher $e_k$ acts
like a pseudo-random permutation. In particular let A denote an adversary against CTR Mode
which makes $q_e$ queries to its encryption oracle, and let all plaintext submitted to both the LR and
encryption oracles be at most $\ell$ blocks in length. There is then an adversary B such that

\[
\mathrm{Adv}^{\text{IND-CPA}}_{\mathrm{CTR}[e_k]}(A; q_e) \le \mathrm{Adv}^{\text{PRP}}_{e_k}(B) + \frac{T^2}{2^n},
\]

where n is the block size of the cipher $e_k$ and $T = (q_e + 1) \cdot \ell$.

With  the  above  restrictions  on  the  reuse  of  the  nonce,  we  obtain  an  IND-CPA  secure  nonce-based 
encryption  scheme  with  exactly  the  same  advantage  statement. 

All  is  not  totally  positive  however;  the  attack  in  Theorem  13.8  also  applies  to  CTR  mode  and 
hence  the  scheme  is  not  OW-CPA  secure  when  used  with  a  fixed  IV.  In  addition  we  cannot  achieve 
CCA  security,  again  due  to  the  attack  presented  in  Theorem  13.3. 


Mode       IV      OW-PASS         OW-CPA                  OW-CCA
ECB Mode   -       OW-CPA => ✓     Thm 13.4 => ✓           Thm 13.3 => ✗
CBC Mode   fixed   IND-PASS => ✓   Thm 13.7 => ✓           Prf of Thm 13.3 => ✗
           nonce   IND-PASS => ✓   Thm 13.7 => ✓           Prf of Thm 13.3 => ✗
           random  IND-PASS => ✓   IND-CPA => ✓            Prf of Thm 13.3 => ✗
OFB Mode   fixed   IND-PASS => ✓   Thm 13.8 => ✗           Prf of Thm 13.3 => ✗
           nonce   IND-PASS => ✓   Thm 13.9 => ✗           Prf of Thm 13.3 => ✗
           random  IND-PASS => ✓   IND-CPA => ✓            Prf of Thm 13.3 => ✗
CFB Mode   fixed   IND-PASS => ✓   (nonce OW-CPA) => ✓     Prf of Thm 13.3 => ✗
           nonce   IND-PASS => ✓   Prf of Thm 13.4 => ✓    Prf of Thm 13.3 => ✗
           random  IND-PASS => ✓   IND-CPA => ✓            Prf of Thm 13.3 => ✗
CTR Mode   fixed   IND-PASS => ✓   Prf of Thm 13.8 => ✗    Prf of Thm 13.3 => ✗
           nonce   IND-PASS => ✓   IND-CPA => ✓            Prf of Thm 13.3 => ✗
           random  IND-PASS => ✓   IND-CPA => ✓            Prf of Thm 13.3 => ✗

Table 13.1. One-way security properties of the five basic modes of operation


We  summarize  our  results  on  modes  of  operation  in  Tables  13.1  and  13.2.  Against  each  tick 
or  cross  we  give  the  theorem  number  which  presents  this  result,  or  the  proof  of  the  theorem  which 
can  be  modified  to  give  the  result.  For  nonce-based  CTR  mode  we  assume  the  convention  with 
respect  to  nonces  described  earlier.  To  derive  most  of  the  tables  we  note  the  implications  given  in 
Figure 11.18; e.g. (¬OW-XXX) => (¬IND-XXX) and (equivalently) (IND-XXX) => (OW-XXX),
also (IND-CPA) => (IND-PASS). We also note that in the passive security setting there is no
difference between the nonce-based and fixed IV variants, and if a scheme is IND-PASS secure in
the random-IV setting, then it is also IND-PASS secure in the nonce-based setting.

Mode       IV      IND-PASS                  IND-CPA                 IND-CCA
ECB Mode   -       Thm 13.2 => ✗             (¬IND-PASS) => ✗        (¬IND-PASS) => ✗
CBC Mode   fixed   (nonce IND-PASS) => ✓     Trivially ✗             Trivially ✗
           nonce   (random IND-PASS) => ✓    Thm 13.5 => ✗           (¬OW-CCA) => ✗
           random  IND-CPA => ✓              Thm 13.6 => ✓           (¬OW-CCA) => ✗
OFB Mode   fixed   (nonce IND-PASS) => ✓     Trivially ✗             Trivially ✗
           nonce   (random IND-PASS) => ✓    (¬OW-CPA) => ✗          (¬OW-CCA) => ✗
           random  IND-CPA => ✓              Prf of Thm 13.6 => ✓    (¬OW-CCA) => ✗
CFB Mode   fixed   (nonce IND-PASS) => ✓     Trivially ✗             Trivially ✗
           nonce   (random IND-PASS) => ✓    Thm 13.10 => ✗          (¬OW-CCA) => ✗
           random  IND-CPA => ✓              Prf of Thm 13.6 => ✓    (¬OW-CCA) => ✗
CTR Mode   fixed   (nonce IND-PASS) => ✓     Trivially ✗             Trivially ✗
           nonce   IND-CPA => ✓              Thm 13.11 => ✓          (¬OW-CCA) => ✗
           random  IND-CPA => ✓              Thm 13.11 => ✓          (¬OW-CCA) => ✗

Table 13.2. Indistinguishability security properties of the five basic modes of operation


13.5.  Obtaining  Chosen  Ciphertext  Security 

All  the  prior  modes  of  operation  did  not  provide  security  against  chosen  ciphertext  attacks.  To  do 
this  one  needs  more  advanced  modes  of  operation,  called  authenticated  encryption  modes.  There 
are  a  number  of  modes  which  provide  this  property  for  symmetric  encryption  based  on  block  ciphers, 
for  example  CCM  Mode,  GCM  Mode  and  OCB  Mode.  There  is  little  room  in  this  book  to  cover 
these  modes,  however  we  will  present  a  simple  technique  to  obtain  an  IND-CCA  secure  symmetric 
cipher  from  one  of  the  previous  IND-CPA  secure  modes  of  operation,  called  Encrypt-then-MAC. 

Let $(E_k, D_k)$ denote an IND-CPA symmetric encryption scheme, say CBC Mode or CTR Mode
instantiated with AES. Let $\mathbb{K}_1$ denote the key space of the encryption scheme. The problem with the
previous  chosen  ciphertext  attacks  was  that  an  adversary  could  create  a  new  ciphertext  without 
needing  to  know  the  underlying  key.  So  the  decryption  oracle  would  decrypt  any  old  garbage 
which  the  adversary  threw  at  it.  The  trick  to  obtaining  CCA  secure  schemes  is  to  ensure  that  the 
decryption  oracle  rejects  almost  all  of  the  ciphertexts  that  the  adversary  creates;  in  fact  we  hope 
it  rejects  all  bar  the  ones  validly  produced  by  an  encryption  oracle. 


13.5.1.  Encrypt-then-MAC:  The  standard  generic  method  of  adding  this  form  of  integrity 
protection  to  the  ciphertext  is  to  append  a  message  authentication  code  to  the  ciphertext.  We  let 
$(\mathrm{Mac}_k, \mathrm{Verify}_k)$ denote a message authentication code with key space $\mathbb{K}_2$. In the next chapter we will
see how such codes can be constructed, but for now just assume we can create one which satisfies the
security definition of strong existential unforgeability under chosen message attacks, from Chapter
11. We then construct our CCA secure symmetric encryption scheme, called Encrypt-then-MAC
(ETM), as follows:

KeyGen(): Sample $k_1 \leftarrow \mathbb{K}_1$ and $k_2 \leftarrow \mathbb{K}_2$, return $k = (k_1, k_2)$.

Enc_k(m): Compute the ciphertext $c_1 \leftarrow E_{k_1}(m)$, then form a MAC on the ciphertext from
$c_2 \leftarrow \mathrm{Mac}_{k_2}(c_1)$. Return the ciphertext $c = (c_1, c_2)$.

Dec_k(c): Verify the MAC and decrypt the ciphertext: $v \leftarrow \mathrm{Verify}_{k_2}(c_2, c_1)$, and $m \leftarrow D_{k_1}(c_1)$. If
v = valid then return m, else return ⊥.
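
The three algorithms above can be sketched as follows. This is an illustration under stated assumptions, not a production design: HMAC-SHA256 plays the role of the MAC, and the encryption scheme `E`/`D` is a toy randomized stream construction invented for this example (it stands in for, say, CTR Mode with AES). `None` represents ⊥.

```python
import hashlib
import hmac
import os


def _pad(k1: bytes, iv: bytes, length: int) -> bytes:
    # Toy keystream: HMAC of the IV and a block counter (illustrative only).
    out, ctr = b"", 0
    while len(out) < length:
        out += hmac.new(k1, iv + ctr.to_bytes(8, "big"), hashlib.sha256).digest()
        ctr += 1
    return out[:length]


def E(k1: bytes, m: bytes) -> bytes:
    iv = os.urandom(16)
    return iv + bytes(a ^ b for a, b in zip(m, _pad(k1, iv, len(m))))


def D(k1: bytes, c1: bytes) -> bytes:
    iv, body = c1[:16], c1[16:]
    return bytes(a ^ b for a, b in zip(body, _pad(k1, iv, len(body))))


def keygen():
    return os.urandom(16), os.urandom(16)           # k = (k1, k2)


def etm_encrypt(k, m: bytes):
    k1, k2 = k
    c1 = E(k1, m)
    c2 = hmac.new(k2, c1, hashlib.sha256).digest()  # MAC over the ciphertext, not m
    return c1, c2


def etm_decrypt(k, c):
    k1, k2 = k
    c1, c2 = c
    v = hmac.compare_digest(c2, hmac.new(k2, c1, hashlib.sha256).digest())
    m = D(k1, c1)                                   # decrypt regardless of v, as in the text
    return m if v else None
```

Any change to `c1` or `c2` makes verification fail, so the decryption oracle returns ⊥ on everything the adversary did not obtain from the encryption oracle.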
Note that the message authentication code is applied to the ciphertext and not the plaintext;
this is very important to maintain security. Also note that, when decrypting, we decrypt the first
ciphertext component $c_1$, irrespective of whether the MAC in $c_2$ verifies or not; this is to avoid subtle
timing attacks. We now show that this construction is IND-CCA secure, assuming the underlying
encryption scheme and message authentication code are suitably secure.

Theorem 13.12. Let A denote an adversary against the ETM scheme constructed from the
encryption scheme $\Pi_1 = (\mathrm{Enc}, \mathrm{Dec})$ and the message authentication code $\Pi_2 = (M, V)$, then there
are two adversaries $B_1$ and $B_2$ such that

\[
\mathrm{Adv}^{\text{IND-CCA}}_{\mathrm{ETM}}(A) \le \mathrm{Adv}^{\text{IND-CPA}}_{\Pi_1}(B_1) + \mathrm{Adv}^{\text{sEUF-CMA}}_{\Pi_2}(B_2).
\]

Proof. Let E denote the event that the adversary produces a ciphertext, which is not the output
of a call to the encryption oracle, which when passed to the decryption oracle results in the output
of something other than ⊥. We have

\[
\Pr[A \text{ wins}] = \Pr[A \text{ wins} \mid \neg E] \cdot \Pr[\neg E] + \Pr[A \text{ wins} \mid E] \cdot \Pr[E]
  \le \Pr[A \text{ wins} \mid \neg E] + \Pr[E].
\]

Now $\Pr[A \text{ wins} \mid \neg E]$ is the same probability as running A in an IND-CPA attack, since if E does
not happen then A does not learn anything from its decryption queries, and all decryption oracle
queries can be answered by either returning ⊥, or remembering what was queried to the encryption
oracle. Hence in this case we can take A to be the $B_1$ in the theorem, and just ignore any decryption
queries which the adversary A makes which do not correspond to the adversary's outputs from the
encryption oracle.

If event E happens then the adversary A must have created a ciphertext, which is different from
the challenge ciphertext, whose MAC verifies. In such a situation we can create an adversary $B_2$
which breaks the message authentication code as follows. We create $B_2$ by using A as a subroutine.

•  Generate $k_1 \leftarrow \mathbb{K}_1$.

•  Call A.

•  When A makes a query to the encryption oracle, use the key $k_1$ to create a first ciphertext
component, and then use $B_2$'s oracle $O_{\mathrm{Mac}_k}$ to create the second ciphertext component.

•  When A makes a query to the $O_{LR}$ oracle, pick the random bit b and proceed as for the
$O_{e_k}$ oracle.

•  When A makes a query to its decryption oracle we first check whether the ciphertext
verifies (using the verification oracle provided to $B_2$). If it does, and the ciphertext is not
the output of $O_{e_k}$, then stop and return the input as a MAC forgery. Otherwise proceed
as in the real game.

Let $(c_1, c_2)$ denote the output of adversary $B_2$. We note that one of $c_1$ and $c_2$ must be different from
the output of $O_{e_k}$ (by definition of $B_2$) and $O_{LR}$ (by the rules A is following). If the difference is $c_1$
then $c_2$ is a valid MAC on a message which has not been queried to $B_2$'s $O_{\mathrm{Mac}_k}$ oracle. Whereas
if $c_1$ is the same as a previous output from $O_{e_k}$ or $O_{LR}$, and $c_2$ is different, then $B_2$ has managed
to produce a strong forgery. Thus in either case a MAC forgery has been created and the result
follows. So $\Pr[E]$ is bounded by the advantage of $B_2$ in winning the forgery game for the MAC
function. □

Note  that  the  same  construction  can  be  used  to  construct  a  CCA-secure  DEM,  i.e.  a  data 
encapsulation  mechanism.  Recall  that  this  is  a  symmetric  encryption  scheme  for  which  only  one 
message  is  ever  created  with  a  given  key,  but  for  which  the  adversary  has  a  decryption  oracle.  Using 
the  same  technique  as  in  the  proof  above  we  can  prove  the  following. 

Theorem 13.13. Let A denote an adversary against the ETM scheme constructed from the
encryption scheme $\Pi_1 = (\mathrm{Enc}, \mathrm{Dec})$ and the message authentication code $\Pi_2 = (M, V)$, then there
are two adversaries $B_1$ and $B_2$ such that

\[
\mathrm{Adv}^{\text{ot-IND-CCA}}_{\mathrm{ETM}}(A) \le \mathrm{Adv}^{\text{IND-PASS}}_{\Pi_1}(B_1) + \mathrm{Adv}^{\text{ot-EUF-CMA}}_{\Pi_2}(B_2).
\]

Note that we only require a passively secure encryption scheme and a one-time secure message
authentication code. Hence we could, to construct a DEM, use CBC Mode with a fixed IV. This
means that the ciphertext for a DEM can be one block shorter than for a general encryption scheme,
as we no longer need to transmit the IV. The use of DEMs will become clearer when we discuss
hybrid encryption in a later chapter.

13.5.2.  Encrypt-and-MAC:  We  stressed  above  that  it  is  important  that  one  authenticates  the 
ciphertext  and  not  the  message.  One  popular  method  in  the  past  for  trying  to  produce  a  CCA  secure 
symmetric  encryption  scheme  was  to  use  a  method  called  Encrypt-and-MAC.  Here  one  applies  a 
MAC  to  the  plaintext  and  then  appends  this  MAC  to  the  ciphertext.  Thus  we  have 

KeyGen(): Sample $k_1 \leftarrow \mathbb{K}_1$ and $k_2 \leftarrow \mathbb{K}_2$, return $k = (k_1, k_2)$.

Enc_k(m): Compute the ciphertext $c_1 \leftarrow E_{k_1}(m)$, then form a MAC on the plaintext from
$c_2 \leftarrow \mathrm{Mac}_{k_2}(m)$. Return the ciphertext $c = (c_1, c_2)$.

Dec_k(c): Decrypt the ciphertext and then verify the MAC: $m \leftarrow D_{k_1}(c_1)$, $v \leftarrow \mathrm{Verify}_{k_2}(c_2, m)$. If
v = valid then return m, else return ⊥.


The  problem  is  that  this  method  is  not  generically  secure.  By  this  we  mean  that  its  security  depends 
on  the  precise  choice  of  the  IND-CPA  encryption  scheme  and  MAC  which  one  selects.  In  particular 
the  scheme  is  not  secure  when  instantiated  with  any  of  the  standard  MAC  functions  we  will  discuss 
in  Chapter  14,  as  the  following  result  shows. 

Theorem  13.14.  Encrypt-and-MAC  is  not  IND-CPA  secure  when  instantiated  with  an  IND-CPA 
encryption  scheme  and  a  deterministic  MAC. 

Proof. The attack is to pick two plaintext messages $m_0$ and $m_1$ of the same length and pass these
to the $O_{LR}$ oracle to obtain a challenge ciphertext $c^* = (c_1^*, c_2^*)$. Now the adversary passes $m_0$ to its
encryption oracle to obtain a new ciphertext $c = (c_1, c_2)$. As the MAC is deterministic, if $c_2 = c_2^*$
then the hidden bit is zero, and if not the hidden bit is one. □
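
The distinguishing attack in this proof can be sketched as follows. As assumptions for illustration, HMAC-SHA256 plays the deterministic MAC, `E` is a toy randomized encryption limited to short messages, and the oracles are modelled as callables; all names are hypothetical.

```python
import hashlib
import hmac
import os


def E(k1: bytes, m: bytes) -> bytes:
    """Toy randomized encryption for messages of at most 32 bytes (illustrative only)."""
    iv = os.urandom(16)
    pad = hmac.new(k1, iv, hashlib.sha256).digest()
    return iv + bytes(a ^ b for a, b in zip(m, pad))


def enc_and_mac(k, m: bytes):
    k1, k2 = k
    c1 = E(k1, m)
    c2 = hmac.new(k2, m, hashlib.sha256).digest()   # MAC over the plaintext
    return c1, c2


def adversary(lr_oracle, enc_oracle) -> int:
    """Guess the hidden bit by comparing the deterministic MAC components."""
    m0, m1 = b"message zero", b"message one!"       # two messages of equal length
    _, tag_star = lr_oracle(m0, m1)                  # challenge c* = (c1*, c2*)
    _, tag0 = enc_oracle(m0)                         # fresh encryption of m0
    return 0 if tag0 == tag_star else 1
```

The randomized first component never enters the comparison; the tag alone leaks which message was encrypted.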

Notice  that  Encrypt-and-MAC  is  less  secure  (in  the  sense  of  the  IND-CPA  notion)  than  the  original 
encryption  algorithm  without  the  MAC! 

13.5.3.  MAC-then-Encrypt:  Another  method  which  has  been  used  in  protocols  in  the  past,  but 
which  again  is  not  secure  in  general  is  called  MAC-then-Encrypt.  Here  we  MAC  the  plaintext  and 
then  encrypt  the  plaintext  and  the  MAC  together.  Thus  we  have 


KeyGen(): Sample $k_1 \leftarrow \mathbb{K}_1$ and $k_2 \leftarrow \mathbb{K}_2$, return $k = (k_1, k_2)$.

Enc_k(m): Form a MAC on the plaintext from $t \leftarrow \mathrm{Mac}_{k_2}(m)$. Compute the ciphertext
$c \leftarrow E_{k_1}(m \| t)$.

Dec_k(c): Decrypt the ciphertext and then verify the MAC: $m \| t \leftarrow D_{k_1}(c)$, $v \leftarrow \mathrm{Verify}_{k_2}(t, m)$. If
v = valid then return m, else return ⊥.


Again  this  method  is  not  generically  secure,  since  we  have  the  following. 

Theorem 13.15. Encryption via the MAC-then-Encrypt method may not be IND-CCA secure when
instantiated with an IND-CPA encryption scheme and EUF-CMA secure MAC.

Proof. We present an, admittedly contrived, example although more complex real-life examples do
exist. Take an IND-CPA encryption scheme $(E_k, D_k)$ and modify it to form the following encryption
scheme, which we shall denote by $(E'_k, D'_k)$. To encrypt one outputs $E_k(m) \| 0$, i.e. one
adds a zero bit onto the ciphertext output by $E_k$. To decrypt one ignores the zero bit at the end,
one does not even bother to check it is zero, and decrypts the main component using $D_k$. It is clear
that, since $(E_k, D_k)$ is IND-CPA secure, so is $(E'_k, D'_k)$.

Now form the MAC-then-Encrypt cipher using $(E'_k, D'_k)$ and any EUF-CMA secure MAC.
Consider a challenge ciphertext $c^*$; this will end in zero. Now we can change this zero to a one, to
create a new ciphertext c, which is different from $c^*$. Now since the decryption algorithm does not
check the last bit, the decryption oracle will decrypt c to reveal the message underlying $c^*$ and the
associated MAC will verify. Thus this MAC-then-Encrypt scheme is not even OW-CCA secure. □
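
The contrived counterexample can be sketched as follows. As assumptions for illustration, HMAC-SHA256 plays the MAC, `E`/`D` is a toy randomized stream encryption invented for this example, and a trailing zero byte is used where the proof appends a single zero bit, purely to keep the code byte-aligned. `None` represents ⊥.

```python
import hashlib
import hmac
import os

TAG = 32  # HMAC-SHA256 tag length in bytes


def _pad(k1: bytes, iv: bytes, length: int) -> bytes:
    out, ctr = b"", 0
    while len(out) < length:
        out += hmac.new(k1, iv + ctr.to_bytes(8, "big"), hashlib.sha256).digest()
        ctr += 1
    return out[:length]


def E(k1: bytes, m: bytes) -> bytes:
    iv = os.urandom(16)
    return iv + bytes(a ^ b for a, b in zip(m, _pad(k1, iv, len(m))))


def D(k1: bytes, c: bytes) -> bytes:
    iv, body = c[:16], c[16:]
    return bytes(a ^ b for a, b in zip(body, _pad(k1, iv, len(body))))


def E_prime(k1: bytes, m: bytes) -> bytes:
    return E(k1, m) + b"\x00"            # append a zero (a byte here, a bit in the text)


def D_prime(k1: bytes, c: bytes) -> bytes:
    return D(k1, c[:-1])                 # ignore the final byte without checking it


def mte_encrypt(k, m: bytes) -> bytes:
    k1, k2 = k
    t = hmac.new(k2, m, hashlib.sha256).digest()
    return E_prime(k1, m + t)


def mte_decrypt(k, c: bytes):
    k1, k2 = k
    plain = D_prime(k1, c)
    m, t = plain[:-TAG], plain[-TAG:]
    ok = hmac.compare_digest(t, hmac.new(k2, m, hashlib.sha256).digest())
    return m if ok else None


def maul(c_star: bytes) -> bytes:
    """Adversary's step: flip the ignored trailing value to get a different ciphertext."""
    return c_star[:-1] + b"\x01"
```

Submitting `maul(c_star)` to a decryption oracle is a legal query, since it differs from the challenge, yet it returns the challenge plaintext.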


Chapter  Summary 


•  Probably  the  most  famous  block  cipher  is  DES,  which  is  itself  based  on  a  general  design 
called  a  Feistel  cipher. 

•  A  comparatively  recent  block  cipher  is  the  AES  cipher,  called  Rijndael. 

•  Both DES and AES obtain their security by repeated application of simple rounds
consisting of substitution, permutation and key addition.

•  To  use  a  block  cipher  one  needs  to  also  specify  a  mode  of  operation.  The  simplest  mode 
is  ECB  mode,  which  has  a  number  of  problems  associated  with  it.  Hence,  it  is  common 
to  use  a  more  advanced  mode  such  as  CBC  or  CTR  mode. 

•  Some  block  cipher  modes,  such  as  CFB,  OFB  and  CTR  modes,  allow  the  block  cipher  to 
be  used  as  a  stream  cipher. 

•  To  obtain  an  IND-CCA  secure  scheme  one  can  use  Encrypt-then-MAC. 


Further  Reading 

The  Rijndael  algorithm,  the  AES  process  and  a  detailed  discussion  of  attacks  on  block  ciphers 
and  Rijndael  in  particular  can  be  found  in  the  book  by  Daemen  and  Rijmen.  Stinson’s  book  is 
the  best  book  to  explain  differential  cryptanalysis  for  students.  For  a  discussion  of  how  to  combine 
encryption  functions  and  MAC  functions  to  obtain  IND-CCA  secure  encryption  see  the  paper  by 
Bellare  and  Namprempre. 

M. Bellare and C. Namprempre. Authenticated encryption: Relations among notions and analysis
of the generic composition paradigm. Advances in Cryptology - Asiacrypt 2000, LNCS 1976,
531–545, Springer, 2000.

J.  Daemen  and  V.  Rijmen.  The  Design  of  Rijndael:  AES  -  The  Advanced  Encryption  Standard. 
Springer,  2002. 

D.  Stinson.  Cryptography  Theory  and  Practice.  Third  Edition.  CRC  Press,  2005. 


CHAPTER  14 


Hash Functions, Message Authentication Codes and Key Derivation Functions


Chapter  Goals 

•  To  understand  the  properties  of  keyed  and  unkeyed  cryptographic  hash  functions. 

•  To  understand  how  existing  deployed  hash  functions  work. 

•  To  examine  the  workings  of  message  authentication  codes. 

•  To  examine  how  key  derivation  functions  are  constructed  from  hash  functions  and  message 
authentication  codes. 


14.1.  Collision  Resistance 

A  cryptographic  hash  function  H  is  a  function  which  takes  arbitrary  length  bit  strings  as  input 
and  produces  a  fixed- length  bit  string  as  output;  the  output  is  often  called  a  digest,  hashcode  or 
hash  value.  Hash  functions  are  used  a  lot  in  computer  science,  but  the  crucial  difference  between 
a  standard  hash  function  and  a  cryptographic  hash  function  is  that  a  cryptographic  hash  function 
should at least have the property of being one-way. In other words given any string y from the
codomain of H, it should be computationally infeasible to find any value x in the domain of H
such that

\[
H(x) = y.
\]

Another way to describe a hash function which has the one-way property is that it is preimage
resistant. Given a hash function which produces outputs of t bits, we would like a function for
which finding preimages requires $O(2^t)$ time. Thus the one-way property should match our one-way
function security game in Figure 11.5.

A cryptographic hash function should also be second preimage resistant. This is the property
that given m it should be hard to find an $m' \ne m$ with $H(m') = H(m)$. The security game for
second preimage resistance is given in Figure 14.1. In particular a cryptographic hash function
with t-bit outputs should require about $2^t$ queries before one can find a second preimage. Thus we
define the advantage of an adversary to break second preimage resistance of a function H to be

\[
\mathrm{Adv}^{\text{2nd-Preimage}}_{H}(A) = \Pr[A \text{ wins the 2nd-Preimage game}].
\]

We say a function H is second preimage resistant if the advantage is “small” (i.e. about $1/2^t$) for
all adversaries A.
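
To make the $2^t$ figure concrete, the following sketch plays the second preimage game against a deliberately weak hash. The function `H` here is an assumption for illustration: SHA-256 truncated to its top t bits, with t small enough that brute force finishes quickly. The name `find_second_preimage` is hypothetical.

```python
import hashlib
from itertools import count


def H(m: bytes, t: int) -> int:
    """Toy t-bit hash: the top t bits of SHA-256 (an illustrative truncation)."""
    return int.from_bytes(hashlib.sha256(m).digest(), "big") >> (256 - t)


def find_second_preimage(m: bytes, t: int):
    """Brute-force adversary: try candidates until one collides with H(m)."""
    target = H(m, t)
    for queries in count(1):
        candidate = queries.to_bytes(8, "big")
        if candidate != m and H(candidate, t) == target:
            return candidate, queries
```

Each candidate hits the target with probability about $1/2^t$, so the expected number of queries is about $2^t$; for example `find_second_preimage(b"challenge", 16)` typically needs tens of thousands of hash evaluations, while a full 256-bit output puts the search far out of reach.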

In  practice  we  need  in  addition  a  property  called  collision  resistance.  This  is  a  much  harder 
property  to  define,  and  as  such  we  give  three  definitions,  all  of  which  are  used  in  this  book,  and 
in  practice.  First  consider  a  function  H  mapping  elements  in  a  domain  D  to  a  codomain  C . 
For  collision  resistance  to  make  any  sense  we  assume  that  the  domain  D  is  much  larger  than  the 
codomain  C.  In  particular  D  could  be  the  set  of  arbitrary  length  bit  strings  {0, 1}*.  A  function  is 
called  collision  resistant  if  it  is  infeasible  to  find  two  distinct  values  m  and  m'  such  that 

\[
H(m) = H(m').
\]

Pictorially this is described in Figure 14.2.

[Figure 14.1. Security game for second preimage resistance: the challenger samples $m \leftarrow D$ and
gives m, together with H, to A, which returns m'; A wins if $H(m') = H(m)$ and $m' \ne m$.]

[Figure 14.2. Security game for collision resistance of a function: A is given H and returns a
pair m, m'; A wins if $H(m) = H(m')$ and $m \ne m'$.]

The  problem  with  this  definition  is  that  we  cannot  define  security.  Recall  that  we  say  something 
is  secure  if  the  probability  of  winning  the  security  game  is  “small” ,  for  all  possible  adversaries  A. 
However, there is a trivial adversary which breaks the above game for a given pre-specified function
H, namely

•  Return m and m' such that $H(m) = H(m')$.

Since  D  is  much  bigger  than  C  we  know  such  a  pair  (m,m')  must  exist,  and  so  must  the  above 
adversary. 

This  looks  exactly  like  the  issue  we  covered  when  we  discussed  pseudo-random  functions  in 
Chapter  11,  and  indeed  it  is.  There  we  got  around  this  problem  by  assuming  the  function  was 
taken  from  a  family  of  functions,  indexed  by  a  key.  In  particular,  the  adversary  did  not  know  which 
function  from  the  family  would  be  selected  ahead  of  time.  Because  the  number  of  functions  in  the 
family  was  exponentially  large,  one  could  not  write  down  a  polynomial-time  adversary  like  the  one 
above  for  a  family. 

However, the use of unkeyed functions which are collision resistant is going to be really
important, so we need to be able to argue about them despite this definitional problem. So let us
go  back  to  our  intuition:  What  we  mean  when  we  say  a  function  is  collision  resistant  is  that  we 
do  not  think  the  trivial  adversary  above  can  be  found  by  the  adversary  attacking  our  system.  In 
other  words  whilst  we  know  that  the  trivial  adversary  exists,  we  do  not  think  it  humanly  possible 
to  construct  it.  We  thus  appeal  to  a  concept  which  Rogaway  calls  human  ignorance.  We  cannot 
define  an  advantage  statement  for  such  functions  but  we  can  define  a  notion  of  security. 

Definition  14.1.  A  function  H  is  said  to  be  collision  resistant  (by  human  ignorance)  or  HI-CR 
secure  if  it  is  believed  to  be  infeasible  to  write  down  a  collision  for  the  function,  i.e.  two  elements 
in  the  domain  mapping  to  the  same  element  in  the  codomain. 

It is harder to construct collision resistant hash functions than one-way hash functions due to the
birthday paradox. To find a collision of a hash function H, we can keep computing

H(m_1), H(m_2), ...

until we get a collision. If the function has an output size of t bits then the probability of obtaining
a collision after q queries to H is about q^2/2^{t+1}. So we expect to obtain a collision after about
sqrt(2^{t+1}) queries. This should be compared with the number of steps needed to find a preimage,
which should be about 2^t for a well-designed hash function. Hence to achieve a security level of
128 bits for a collision resistant hash function we need roughly 256 bits of output.
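
As a rough illustration of the birthday search just described, the following Python sketch builds a
toy t-bit hash by truncating SHA-256 and hashes m_1, m_2, ... until two values coincide. The names
H and birthday_collision, the truncation, and the choice t = 24 are assumptions made for this
example only; they are not part of the text.

```python
import hashlib


def H(m: bytes, t: int) -> int:
    """Toy t-bit hash (an assumption): the top t bits of SHA-256(m)."""
    digest = int.from_bytes(hashlib.sha256(m).digest(), "big")
    return digest >> (256 - t)


def birthday_collision(t: int):
    """Hash m_1, m_2, ... until two distinct messages share a hash value."""
    seen = {}  # hash value -> message that produced it
    q = 0      # number of queries made to H so far
    while True:
        q += 1
        m = q.to_bytes(8, "big")  # the messages are just encoded counters
        h = H(m, t)
        if h in seen:
            return seen[h], m, q
        seen[h] = m


m1, m2, q = birthday_collision(24)
print(f"collision after {q} queries; sqrt(2^25) is about {2 ** 12.5:.0f}")
```

For a single run the number of queries will fluctuate around the estimate sqrt(2^{t+1}); the point
is only that it is far below the roughly 2^t queries expected for a preimage search.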

Whilst the above, somewhat disappointing, definition of collision resistance is useful in many
situations, in other situations a more robust, and less disappointing, definition is available. Here we
consider a family of functions {f_k}_K; the challenger picks a given function from the family and then
asks the adversary to find a collision. We can define two security games, one where the adversary is
given access to the function via an oracle (see Figure 14.3) and one where the adversary is actually
given the key (see Figure 14.4).

[Figure 14.3. Security game for weak collision resistance of a family of functions. The challenger
computes k ← KeyGen(); A may query an oracle O_{f_k} which returns y ← f_k(x); A outputs m_1, m_2
and wins if f_k(m_1) = f_k(m_2) and m_1 ≠ m_2.]

[Figure 14.4. Security game for collision resistance of a family of functions. The challenger
computes k ← KeyGen() and gives k to A; A outputs m_1, m_2 and wins if f_k(m_1) = f_k(m_2) and
m_1 ≠ m_2.]

We define the advantages Adv^{CR}_{{f_k}_K}(A) and Adv^{wCR}_{{f_k}_K}(A) in the usual way as

Adv^{CR}_{{f_k}_K}(A) = Pr[A wins the collision resistance game],

Adv^{wCR}_{{f_k}_K}(A) = Pr[A wins the weak collision resistance game].

We say that the family is CR secure (resp. wCR secure) if the advantage is “small” for all adversaries
A. For a good such function with t-bit outputs we hope that the advantage of the adversary in finding
a collision is given by the birthday bound, i.e. q^2/2^{t+1}, where q models the number of queries A
makes to the oracle O_{f_k}, or essentially its running time in the case of collision resistance. Such
keyed function families are often called (keyed) collision resistant hash functions.

In  summary  a  cryptographic  hash  function  needs  to  satisfy  the  following  three  properties: 

(1)  Preimage  Resistant:  It  should  be  hard  to  find  a  message  with  a  given  hash  value. 

(2) Second Preimage Resistant: Given one message it should be hard to find another
message with the same hash value.

(3) Collision Resistant: It should be hard to find two messages with the same hash value.

Note  that  the  first  two  properties  can  also  be  applied  to  function  families.  We  leave  the  respective 
definitions  to  the  reader. 

We  can  relate  these  three  properties  using  reductions;  note  that  the  argument  in  the  proof  of 
the  next  lemma  applies  to  collision  resistance  in  any  of  the  three  ways  we  have  defined  it. 

Lemma  14.2.  Assuming  a  function  H  is  preimage  resistant  for  every  element  of  the  range  of  H 
is  a  weaker  assumption  than  assuming  it  is  either  collision  resistant  or  second  preimage  resistant. 

Proof. Suppose H is a function and let O denote an oracle which on input of y finds an x such
that H(x) = y, i.e. O is an oracle which breaks the preimage resistance of the function H. Using
O we can then find a collision in H by choosing x at random and then computing y = H(x).
Passing y to the oracle O will produce a value x' such that y = H(x'). Since H is assumed to have
(essentially) infinite domain, it is unlikely that we have x = x'. Hence, we have found a collision in
H. A similar argument applies to breaking the second preimage resistance of H. □
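
The reduction in this proof can be sketched in Python under some loudly stated assumptions: the
hash H is a toy 12-bit truncation of SHA-256 (so that a brute-force search can play the role of the
oracle O), and the names preimage_oracle and collision_from_preimage_oracle are hypothetical. For a
real hash function no such oracle is known; the sketch only shows how a collision would be
assembled if one existed.

```python
import hashlib

T_BITS = 12  # toy output size, small enough for brute force (assumption)


def H(x: bytes) -> int:
    """Toy hash (an assumption): the top T_BITS bits of SHA-256(x)."""
    return int.from_bytes(hashlib.sha256(x).digest(), "big") >> (256 - T_BITS)


def preimage_oracle(y: int) -> bytes:
    """Stand-in for the oracle O: brute-force search over 4-byte strings."""
    for i in range(2 ** 32):
        candidate = i.to_bytes(4, "big")
        if H(candidate) == y:
            return candidate
    raise LookupError("no 4-byte preimage found")


def collision_from_preimage_oracle(x: bytes):
    """The reduction: hash x, invert the result, and check that x' != x."""
    y = H(x)
    x_prime = preimage_oracle(y)
    return (x, x_prime) if x_prime != x else None


# x comes from a much larger domain than the oracle's search space,
# mirroring the "essentially infinite domain" assumption in the proof.
pair = collision_from_preimage_oracle(b"a message drawn from a much larger domain")
```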


We can construct hash functions which are collision resistant but are not one-way for some of the
range of H. As an example, let g(x) denote a HI-CR secure function with outputs of bit length n.
Now define a new hash function H(x) with output size n + 1 bits as follows:

H(x) ← 0 || x       if |x| = n,
       1 || g(x)    otherwise.

The function H(x) is clearly still HI-CR secure, as we have assumed g(x) is HI-CR secure. But the
function H(x) is not preimage resistant as one can invert it on any value in the range which starts
with a zero bit. So even though we can invert the function H(x) on some of its range we are unable
to find collisions.
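
A small Python sketch of this counterexample, working on bit strings written as Python strings of
'0' and '1'. The stand-in g (SHA-256 truncated to n = 16 bits) is an assumption chosen purely to
make the example run; at that size it is of course not collision resistant, and invert_if_easy is
a hypothetical helper name.

```python
import hashlib

N = 16  # output length n of g in bits (toy value, an assumption)


def g(x: str) -> str:
    """Stand-in for g: the first N bits of SHA-256 over the bit string x."""
    digest = int.from_bytes(hashlib.sha256(x.encode()).digest(), "big")
    return format(digest >> (256 - N), f"0{N}b")


def H(x: str) -> str:
    """H(x) = 0 || x if |x| = n, and 1 || g(x) otherwise."""
    return "0" + x if len(x) == N else "1" + g(x)


def invert_if_easy(y: str):
    """Recover a preimage when y starts with a zero bit; give up otherwise."""
    return y[1:] if y[0] == "0" else None


x = "1010101010101010"              # an input of exactly n bits
print(invert_if_easy(H(x)) == x)    # inversion succeeds on this part of the range
print(invert_if_easy(H("101")))     # no shortcut when the output starts with 1
```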

Lemma  14.3.  Assuming  a  function  is  second  preimage  resistant  is  a  weaker  assumption  than 
assuming  it  is  collision  resistant. 


Proof. Assume we are given an oracle O which on input of x will find x' such that x ≠ x' and
H(x) = H(x'). We can clearly use O to find a collision in H by choosing x at random. □


Another use of hash functions in practice will be for deriving keys from so-called keying material.
When used in this way we say the function is a key derivation function, or KDF¹. A key derivation
function should act much like a PRF, except we now deal with arbitrary length inputs and outputs.
Thus a KDF acts very much like a stream cipher with a fixed IV. We think of a keyed KDF G_k as
taking a length ℓ, where ℓ defines how many bits of output are going to be produced for this key
k. The key size for a KDF should also be variable, in that it can be drawn from any distribution
of the challenger’s choosing. Thus in our security game, in Figure 14.5, the challenger picks the
distribution K and passes this to the KeyGen() function and the adversary. In the game the oracle
O_{G_k} can only be called once.

In  some  sense  we  have  already  seen  a  KDF  when  we  discussed  CTR  Mode;  however  for  CTR 
Mode  the  key  size  was  limited  to  the  key  size  of  the  underlying  block  cipher,  the  output  values  of 
x  in  the  game  were  limited  to  the  size  of  the  input  block,  and  the  output  size  had  to  be  a  multiple 
of  a  block  length.  So  whilst  CTR  Mode  seems  to  give  us  all  the  security  properties  we  require,  the 
functionality  is  rather  limited.  Despite  this  we  will  see  later  how  to  utilize  CTR  Mode  to  give  us 
precisely  the  KDFs  we  require. 


¹ We will see that we can construct KDFs from other primitives, and not only hash functions.

[Figure 14.5. The security game for a KDF. The challenger picks a distribution K and a length
ℓ ∈ N, then samples b ← {0,1} and k ← KeyGen(K). The adversary A may call the oracle O_{G_k} once
on an input x: if b = 0 the oracle returns y ← {0,1}^ℓ, else it returns y ← G_k(x, ℓ). A outputs b'
and wins if b' = b.]


We end this section with a final remark on how we should think about the security of (unkeyed)
hash functions. The method adopted in much of practical cryptography is to assume that an
unkeyed hash function acts like a truly random function, and that (despite the adversary having
the code of H) it therefore “acts” like a random oracle (see Section 11.9). In practice we define
our cryptographic schemes and protocols assuming a random oracle, and then replace the random
oracle in the real protocol by a fixed hash function. Whilst not a perfect methodology, this does
work in most instances and results in relatively simple final schemes and protocols.

14.2.  Padding 

In Chapter 13 we skipped over discussing how to pad a message to a multiple of the block length;
this is done via a padding scheme. In this chapter padding schemes will be more important, because
we want to deal with arbitrary length messages, but the primitives from which we will build things
only take a fixed-length input, or an input which is a multiple of a fixed length. As when we
discussed block ciphers we will call this fixed length the block size, and denote it by b. We will
assume for simplicity that b > 64 in what follows.

Now given an input message m which is t bits in length we will want to make it a message of
k · b bits in length. This is done by padding, or extending, the message by adding extra bits onto
the message until it is of the required length. However, there are many ways of doing this; we shall
outline five. As a notation we write for the padded message

m || pad_i(|m|, b)

where  i  refers  to  the  padding  scheme  (see  below),  and  the  padding  function  takes  as  input  the 
length  of  the  message  and  the  block  size  we  need  to  pad  to.  We  always  assume  (in  this  book)  that 
we  want  to  pad  to  the  next  available  block  boundary  given  the  message  length  and  the  padding 
scheme,  and  that  the  padding  will  be  applied  at  the  end  of  the  message.  This  last  point  is  not 
needed  in  theory,  and  indeed  in  theory  one  can  obtain  very  efficient  schemes  by  padding  at  the 
start.  However,  in  practice  almost  all  padding  is  applied  to  the  end  of  a  message. 

We  define  our  five  padding  schemes  as  follows: 


• Method 0: Let v denote b − |m| (mod b). Add v zeros to the end of the message m, i.e.
m || pad_0(|m|, b) = m || 0*.

• Method 1: Let v denote b − (|m| + 1) (mod b). Append a single 1 bit to the message, and
then pad with v zeros, i.e. m || pad_1(|m|, b) = m || 10*.

• Method 2: Let v denote b − (|m| + 65) (mod b). Encode |m| as a 64-bit integer ℓ. Append
a single 1 bit to the message, and then pad with v zeros, and then append the 64-bit
integer ℓ, i.e. m || pad_2(|m|, b) = m || 10* || ℓ.

• Method 3: Let v denote b − (|m| + 64) (mod b). Encode |m| as a 64-bit integer ℓ. Pad
with v zeros, and then append the 64-bit integer ℓ, i.e. m || pad_3(|m|, b) = m || 0* || ℓ.

• Method 4: Let v denote b − (|m| + 2) (mod b). Append a single 1 bit to the message, and
then pad with v zeros, and then add a one-bit, i.e. m || pad_4(|m|, b) = m || 10*1.

Notice  that  in  all  of  these  methods,  bar  method  zero,  given  a  padded  message  it  is  easy  to  work  out 
the  original  message.  For  method  zero  we  cannot  tell  the  difference  between  a  message  consisting 
of  one  zero  bit  and  two  zero  bits!  We  will  see  later  why  this  causes  a  problem.  In  addition,  for 
methods  one  to  four,  if  two  messages  are  of  equal  length  then  the  two  pads  produced  are  equal.  In 
addition,  for  padding  methods  two  and  three,  if  the  messages  are  not  of  equal  length  then  the  two 
pads  are  distinct. 
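
The five methods can be written down compactly in Python. The sketch below represents messages as
strings of '0' and '1' and returns the whole padded message m || pad_i(|m|, b). The function name
pad, the string representation, and the big-endian encoding of the 64-bit length field are
assumptions made for illustration; the text does not fix a bit order.

```python
def pad(m: str, b: int, scheme: int) -> str:
    """Return m || pad_scheme(|m|, b) for a bit string m and block size b."""
    n = len(m)
    length_field = format(n, "064b")  # |m| as a 64-bit integer (big-endian, assumed)
    if scheme == 0:                   # m || 0*
        return m + "0" * (-n % b)
    if scheme == 1:                   # m || 1 0*
        return m + "1" + "0" * (-(n + 1) % b)
    if scheme == 2:                   # m || 1 0* || length
        return m + "1" + "0" * (-(n + 65) % b) + length_field
    if scheme == 3:                   # m || 0* || length
        return m + "0" * (-(n + 64) % b) + length_field
    if scheme == 4:                   # m || 1 0* 1
        return m + "1" + "0" * (-(n + 2) % b) + "1"
    raise ValueError("scheme must be in 0..4")


# Method zero cannot tell one zero bit from two zero bits; method one can.
print(pad("0", 128, 0) == pad("00", 128, 0))
print(pad("0", 128, 1) == pad("00", 128, 1))
```

In each branch the expression `-(n + c) % b` is the quantity v from the corresponding method, so
the padded length is always a multiple of b.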

Before  proceeding  we  pause  to  note  that  any  of  these  padding  schemes  can  be  used  with  our 
earlier  symmetric  encryption  schemes  based  on  block  ciphers,  excluding  padding  method  zero. 
However,  when  considering  the  functions  in  this  chapter  the  precise  padding  scheme  will  have  an 
impact  on  the  underlying  properties  of  the  functions,  in  particular  the  security. 


14.3.  The  Merkle-Damgard  Construction 

In  this  section  we  describe  the  basic  Merkle-Damgard  construction  of  a  hash  function  taking  inputs 
of  arbitrary  length  from  a  hash  function  which  takes  inputs  of  a  fixed  length.  The  building  block  (a 
hash  function  taking  inputs  of  a  fixed  length)  is  called  a  compression  function,  and  the  construction 
is  very  much  like  a  mode  of  operation  of  a  block  cipher.  The  construction  was  analysed  in  two  papers 
by  Merkle  and  Damgard,  although  originally  in  the  context  of  unkeyed  compression  functions. 


[Figure 14.6. The Merkle-Damgard construction MD[f_k, s].]

See Figure 14.6 for an overview of the construction and Algorithm 14.1 for an algorithmic
viewpoint. The construction is based on a family of compression functions f_k which map (ℓ + n)-
bit inputs to n-bit outputs. The construction also makes use of an internal state variable s_i which
is updated by the application of the compression function. At each iteration this internal state is
updated, by taking the current state and the next message block and applying the compression
function. At the end the internal state is output as the result of the hash function, which we denote
by MD[f_k, s].


Algorithm 14.1: Merkle-Damgard construction

Pad the input message m using a padding scheme, so that the output is a multiple of ℓ bits
in length.
Divide the input m into t blocks of length ℓ bits, m_1, ..., m_t.
s_0 ← s.
for i = 1 to t do
    s_i ← f_k(m_i || s_{i−1}).
return s_t.
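
Algorithm 14.1 translates almost line for line into Python. In the sketch below the compression
function f is a hypothetical stand-in (SHA-256 over k, the block and the state, truncated to n
bits), the sizes ℓ = 128 and n = 64 are arbitrary, and padding method two is applied at byte
granularity with a big-endian length field; all of these are assumptions for illustration only.

```python
import hashlib

L_BYTES = 16  # block length ℓ, in bytes (assumption)
N_BYTES = 8   # state/output length n, in bytes (assumption)


def f(k: bytes, block: bytes, state: bytes) -> bytes:
    """Hypothetical keyed compression function f_k(m_i || s_{i-1})."""
    return hashlib.sha256(k + block + state).digest()[:N_BYTES]


def pad2(m: bytes) -> bytes:
    """Padding method two on bytes: m || 1 0* || 64-bit length of m in bits."""
    zeros = b"\x00" * (-(len(m) + 9) % L_BYTES)
    return m + b"\x80" + zeros + (8 * len(m)).to_bytes(8, "big")


def md(k: bytes, s: bytes, m: bytes) -> bytes:
    """MD[f_k, s](m): iterate f_k over the padded blocks, starting from s."""
    padded = pad2(m)
    state = s                                   # s_0 <- s
    for i in range(0, len(padded), L_BYTES):
        state = f(k, padded[i:i + L_BYTES], state)
    return state                                # s_t


digest = md(b"demo key", bytes(N_BYTES), b"abc")
print(digest.hex())
```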
14.3.1. Theoretical Properties of the Merkle-Damgard Construction: The problem with
the MD[f_k, s] construction is that collision resistance depends on the padding scheme being used.
The following toy example illustrates the problem. Suppose we have a function f_k which has
n = ℓ = 4 in the above notation, i.e. it takes as input bit strings of length eight bits, and outputs
bit strings of length four.

Consider applying the construction to the following two messages, one which is one bit long and
one which is two bits long:

m_1 = 0b0, m_2 = 0b00.

The output of the basic Merkle-Damgard construction, when used with padding method zero, will
be

MD[f_k, s](m_1) = f_k(0b0000 || s) = MD[f_k, s](m_2),

i.e. we obtain a collision. The problem is that padding method zero does not provide a unique
way of decoding to obtain the original message. Thus 0b0000 could correspond to the message 0b0,
or 0b00, or 0b000 or even 0b0000. So when using the function MD[f_k, s] one should always use
padding method one, two or three. In practice, all the standardized hash functions based on the
Merkle-Damgard construction use padding method two. The use of padding method two will be
exploited in the proof of Theorem 14.4 below.
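
The toy example can be checked mechanically. The following Python sketch uses n = ℓ = 4 on bit
strings; the compression function f (four bits taken from SHA-256) and the fixed initial state
0000 are assumptions, and any other choice of f would exhibit the same collision, because the two
padded messages are already identical before f is ever applied.

```python
import hashlib

ELL = 4  # block length ℓ in bits
N = 4    # state length n in bits


def f(block: str, state: str) -> str:
    """Hypothetical compression function from 8 bits to 4 bits."""
    digest = hashlib.sha256((block + state).encode()).digest()
    return format(digest[0] >> 4, "04b")


def pad0(m: str) -> str:
    """Padding method zero: m || 0*."""
    return m + "0" * (-len(m) % ELL)


def pad1(m: str) -> str:
    """Padding method one: m || 1 0*."""
    return m + "1" + "0" * (-(len(m) + 1) % ELL)


def md(m: str, pad, s: str = "0000") -> str:
    """Basic Merkle-Damgard iteration with the given padding function."""
    padded = pad(m)
    for i in range(0, len(padded), ELL):
        s = f(padded[i:i + ELL], s)
    return s


print(pad0("0"), pad0("00"))                # the same padded block twice
print(md("0", pad0) == md("00", pad0))      # hence a collision
print(pad1("0"), pad1("00"))                # method one keeps the inputs apart
```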

Theorem 14.4. Let H_{k,s}(m) = MD[f_k, s](m) denote the keyed hash function constructed using
the Merkle-Damgard method from the keyed compression function family {f_k(x)}_K as above, using
padding method two. Then if {f_k(x)}_K is CR/wCR secure, then so is H_{k,s}.


Proof. Suppose A is an adversary against the CR/wCR security of H_{k,s}(m). From A we wish to
build an adversary B against the CR/wCR security of the family {f_k(x)}_K. Algorithm B will either
have as input k (for the CR game) or have access to an oracle to compute f_k for a fixed value of k
(for the wCR game). Algorithm B then picks a random s and passes it to A (note that if we are
assuming s is fixed and public then this step can be missed). In addition B either provides A with
k, or (in the case of wCR security) provides an oracle for H_{k,s}(m) created from B's own access to
the oracle for computing f_k.

The adversary A will output, with some non-zero probability, a collision H_{k,s}(m) = H_{k,s}(m'),
for which m ≠ m'. Let us assume that m is t blocks long and m' is t' blocks long, and that
(without loss of generality) the final added padding block does not produce a new block. So the
actual messages hashed are

m_1, m_2, ..., m_t || pb and m'_1, m'_2, ..., m'_{t'} || pb',

where pb and pb' denote the specific public padding blocks added at the end of the messages.

We now unpeel the function MD[f_k, s] one layer at a time. We know that we have, since we
have a collision, that

f_k(m_t || pb || s_{t−1}) = f_k(m'_{t'} || pb' || s'_{t'−1}).

Now, unless (m_t || pb || s_{t−1}) = (m'_{t'} || pb' || s'_{t'−1}), we have found a collision in f_k
and algorithm B just outputs this collision. So suppose these tuples are equal, in particular that
the last 64 bits of each of the padding blocks are equal. This last fact implies, since we are using
padding method two, that the two messages are of equal length and so t = t'. It also means that
the two chaining variables from the last round are equal and so we have a new equation to analyse

f_k(m_{t−1} || s_{t−2}) = f_k(m'_{t−1} || s'_{t−2}).

So we either have a collision now, or the two pairs of input are equal. Continuing in this way
we either produce a collision on f_k or the two input messages are identical, i.e. m = m', but we
assumed this did not happen. Thus we must find a collision in f_k. □
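
The unpeeling argument is constructive, and the following Python sketch walks through it on a
deliberately weak instance. All concrete choices are assumptions: the unkeyed toy compression
function f has an 8-bit output so that a collision in the iterated hash can be found by
exhaustion, padding method two is applied on bytes, and the names chain, find_md_collision and
extract_f_collision are hypothetical.

```python
import hashlib

BLOCK = 16  # block length in bytes (assumption)


def f(block: bytes, state: bytes) -> bytes:
    """Toy compression function with an 8-bit output (an assumption)."""
    return hashlib.sha256(block + state).digest()[:1]


def pad2(m: bytes) -> bytes:
    """Padding method two on bytes, with a big-endian 64-bit length field."""
    zeros = b"\x00" * (-(len(m) + 9) % BLOCK)
    return m + b"\x80" + zeros + (8 * len(m)).to_bytes(8, "big")


def chain(m: bytes, s: bytes = b"\x00"):
    """Run the iteration, recording (block, state_in, state_out) per round."""
    padded, steps = pad2(m), []
    for i in range(0, len(padded), BLOCK):
        block = padded[i:i + BLOCK]
        out = f(block, s)
        steps.append((block, s, out))
        s = out
    return steps


def md(m: bytes) -> bytes:
    return chain(m)[-1][2]


def find_md_collision():
    """Plays the adversary A: 257 messages into 256 outputs must collide."""
    seen = {}
    for i in range(257):
        m = i.to_bytes(4, "big") + b"x" * 20
        h = md(m)
        if h in seen:
            return seen[h], m
        seen[h] = m
    raise AssertionError("unreachable by the pigeonhole principle")


def extract_f_collision(m1: bytes, m2: bytes):
    """Plays B: peel rounds off from the end until the inputs to f differ."""
    for (b1, s1, o1), (b2, s2, o2) in zip(reversed(chain(m1)), reversed(chain(m2))):
        assert o1 == o2  # outputs agree at every layer we reach
        if (b1, s1) != (b2, s2):
            return (b1, s1), (b2, s2)
    raise ValueError("the two messages were identical")


m1, m2 = find_md_collision()
x, y = extract_f_collision(m1, m2)
```

If the two messages had different lengths, the length field in the final block would differ and the
loop would stop at its first iteration, which is exactly the role padding method two plays in the
proof.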


The main problem, from a theoretical perspective, with the Merkle-Damgard construction is
that we can think of it in one of three ways.

(1) In practice the value s is fixed to a given value IV, and the function f_k is not taken from
a keyed function family, but is taken as a fixed function f. One then interprets Theorem
14.4 as saying that if the function f is HI-CR secure then H(m) = MD[f, IV] is also HI-CR
secure.

(2) In practice we can also think of s_0 as defining a “key”, but with the function f_k still being
fixed. This viewpoint will be useful when defining HMAC below. In this case one interprets
Theorem 14.4 as saying that if the function f is HI-CR secure then H_s(m) = MD[f, s] is
wCR secure and CR secure.

(3) If we are able to select f_k from a family of pseudo-random functions, then we take s
to be a fixed IV and Theorem 14.4 says that if the function f_k is CR secure then so is
H_k(m) = MD[f_k, IV]. Whilst this is the traditional result in theoretical cryptography, we
note that it means absolutely nothing in practice.

Thus  we  have  three  ways  of  thinking  of  the  Merkle-Damgard  construction;  two  are  useful  in  practice 
and  one  is  useful  in  theory.  So  in  this  case  theory  and  practice  are  not  aligned. 

Another property we will require of the compression function used within the Merkle-Damgard
construction is that the fixed function f(m || s) is a secure message authentication code on (ℓ − 65)-
bit messages, when we consider s as the key to the MAC and padding method two is applied. This
property will be needed when we construct HMAC below. We cannot prove this property for any
of the specific instances of the function f considered below; it is simply an assumption, much like
the assumption that AES defines a secure pseudo-random permutation.

We now turn to discussing the preimage and second preimage resistance of the Merkle-Damgard
construction. To do this we make the following assumption about the function f(m || s), when
considered as a function of two inputs m and s. This is non-standard and is made to make the
following discussion slightly simpler.

Definition 14.5. A function of two inputs f(x, y) where x ∈ X, y ∈ Y and f(x, y) ∈ Z is said
to be uniformly distributed in its first component if, for all values of y, the values of the function
f_y(x) = f(x, y) are uniformly distributed in Z as x ranges uniformly over X.

This definition is somewhat reasonable to assume if the set X is much larger than Z, which it
will be in all of the hash functions resulting from the Merkle-Damgard construction, and f is well
designed. Using this we can show:

Theorem 14.6. Let A be an adversary which finds preimages/second preimages for the hash
function H(m) = MD[f, IV]. Assume that f is uniformly distributed in its first component, and that
the first domain component is much larger than the codomain. Then there is an adversary B which
can find preimages/second preimages in f.

Proof. We show the result for preimage resistance; for second preimage resistance we follow
roughly the same argument. Let h be the input to the algorithm B. We pass h to the adversary
A to obtain a preimage m of the hash function. Note that we can do this since f is uniformly
distributed in its first component, and hence the value h “looks like” a value which could be output
by the hash function. Hence, algorithm A will produce a preimage with its normal probability.

We now run the hash function forwards to obtain the input to the function for the last round.
This input is then output by algorithm B as its preimage on f. □

14.4.  The  MD-4  Family 

The most widely deployed hash functions are MD-5, RIPEMD-160, SHA-1 and SHA-2, all of which
are based on the Merkle-Damgard construction using a fixed (i.e. unkeyed) compression function
f. The MD-5 algorithm produces outputs of 128 bits in size, whilst RIPEMD-160 and SHA-1
both produce outputs of 160 bits in length. SHA-2 is actually three algorithms, SHA-256,
SHA-384 and SHA-512, having outputs of 256, 384 and 512 bits respectively. All of these hash
functions are derived from an earlier simpler algorithm called MD-4.

The seven main algorithms in the MD-4 family are

• MD-4: The function has 3 rounds of 16 steps and an output bit length of 128 bits.

• MD-5: The function has 4 rounds of 16 steps and an output bit length of 128 bits.

• SHA-1: The function has 4 rounds of 20 steps and an output bit length of 160 bits.

• RIPEMD-160: The function has 5 rounds of 16 steps and an output bit length of 160
bits.

• SHA-256: The function has 64 rounds of single steps and an output bit length of 256
bits.

• SHA-384: The function is identical to SHA-512 except the output is truncated to 384
bits, and the initial chaining value H is different.

• SHA-512: The function has 80 rounds of single steps and an output bit length of 512
bits.

We discuss MD-4, SHA-1 and SHA-256 in detail; the others are just more complicated versions
of MD-4, which we leave to the interested reader to look up in the literature. In recent years a
number of weaknesses have been found in almost all of the early hash functions in the MD-4 family,
for example MD-4, MD-5 and SHA-1. Hence, it is wise to move all applications to use the SHA-2
algorithms, or the new sponge-based SHA-3 algorithm discussed later.

14.4.1. MD-4: We stress that MD-4 should be considered broken; we only present it for illustrative
purposes as it is the simplest of all the constructions. In MD-4 there are three bit-wise functions
on three 32-bit variables

f(u, v, w) = (u ∧ v) ∨ ((¬u) ∧ w),

g(u, v, w) = (u ∧ v) ∨ (u ∧ w) ∨ (v ∧ w),

h(u, v, w) = u ⊕ v ⊕ w.

Throughout the algorithm we maintain a current hash state, corresponding to the value s_i in our
discussion above,

H = (H_1, H_2, H_3, H_4)

of four 32-bit values. Thus the output is 128 bits long, with the input length to the
compression function f being 512 + 128 = 640 bits in length. There are various fixed constants
(y_j, z_j, w_j), which depend on each round. We have

y_j = 0             for 0 ≤ j ≤ 15,
      0x5A827999    for 16 ≤ j ≤ 31,
      0x6ED9EBA1    for 32 ≤ j ≤ 47,

and the values of z_j and w_j are given by the following arrays,

z_{0...15} = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],

z_{16...31} = [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15],

z_{32...47} = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15],

w_{0...15} = [3, 7, 11, 19, 3, 7, 11, 19, 3, 7, 11, 19, 3, 7, 11, 19],

w_{16...31} = [3, 5, 9, 13, 3, 5, 9, 13, 3, 5, 9, 13, 3, 5, 9, 13],

w_{32...47} = [3, 9, 11, 15, 3, 9, 11, 15, 3, 9, 11, 15, 3, 9, 11, 15].

We then execute the steps in Algorithm 14.2 for each 16 words entered from the data stream,
where a word is 32 bits long and <<< denotes a bit-wise rotation to the left. The data stream is
first padded with padding method two, so that it is an exact multiple of 512 bits long. On each
iteration of Algorithm 14.2, the data stream is loaded 16 words at a time into X_j for 0 ≤ j < 16.


Algorithm 14.2: The MD-4 f(X || s_{r−1}) compression function

(A, B, C, D) ← s_{r−1} = (H_1, H_2, H_3, H_4).
/* Round 1 */
for j = 0 to 15 do
    t ← A + f(B, C, D) + X_{z_j} + y_j.
    (A, B, C, D) ← (D, t <<< w_j, B, C).
/* Round 2 */
for j = 16 to 31 do
    t ← A + g(B, C, D) + X_{z_j} + y_j.
    (A, B, C, D) ← (D, t <<< w_j, B, C).
/* Round 3 */
for j = 32 to 47 do
    t ← A + h(B, C, D) + X_{z_j} + y_j.
    (A, B, C, D) ← (D, t <<< w_j, B, C).
(H_1, H_2, H_3, H_4) ← s_r = (H_1 + A, H_2 + B, H_3 + C, H_4 + D).

Here + denotes addition of 32-bit words modulo 2^32.


After all data has been read in, the output is the concatenation of the final value of

H_1, H_2, H_3, H_4.

When used in practice the initial value s_0 is initialized with the fixed values

H_1 ← 0x67452301, H_2 ← 0xEFCDAB89,

H_3 ← 0x98BADCFE, H_4 ← 0x10325476.
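
Putting the pieces of this subsection together gives the following Python sketch of MD-4, shown
for illustration only since MD-4 is broken. The step function, constants and initial values follow
the description above; the little-endian encoding of words and of the 64-bit length field is taken
from RFC 1320, as the text does not specify a byte order. The names rotl, compress and md4 are
choices made for this sketch.

```python
import struct

MASK = 0xFFFFFFFF

# Per-step constants y_j, message word indices z_j and rotation amounts w_j.
Y = [0] * 16 + [0x5A827999] * 16 + [0x6ED9EBA1] * 16
Z = (list(range(16))
     + [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]
     + [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15])
W = [3, 7, 11, 19] * 4 + [3, 5, 9, 13] * 4 + [3, 9, 11, 15] * 4


def rotl(x, s):
    """Rotate a 32-bit word left by s bits."""
    return ((x << s) | (x >> (32 - s))) & MASK


def f(u, v, w): return (u & v) | (~u & w)
def g(u, v, w): return (u & v) | (u & w) | (v & w)
def h(u, v, w): return u ^ v ^ w


def compress(state, block):
    """Algorithm 14.2: one application of the MD-4 compression function."""
    X = struct.unpack("<16I", block)       # 16 little-endian 32-bit words
    A, B, C, D = state
    for j in range(48):
        fn = (f, g, h)[j // 16]            # round 1, 2 or 3
        t = (A + fn(B, C, D) + X[Z[j]] + Y[j]) & MASK
        A, B, C, D = D, rotl(t, W[j]), B, C
    return tuple((s + v) & MASK for s, v in zip(state, (A, B, C, D)))


def md4(msg: bytes) -> bytes:
    """Pad with method two, then iterate the compression function."""
    bit_length = (8 * len(msg)) & 0xFFFFFFFFFFFFFFFF
    padded = (msg + b"\x80" + b"\x00" * ((55 - len(msg)) % 64)
              + struct.pack("<Q", bit_length))
    state = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
    for i in range(0, len(padded), 64):
        state = compress(state, padded[i:i + 64])
    return struct.pack("<4I", *state)


print(md4(b"abc").hex())
```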


14.4.2. SHA-1: When discussing SHA-1 it becomes clear that it is very, very similar to MD-4;
for example we use the same bit-wise functions f, g and h as in MD-4. However, for SHA-1 the
internal state of the algorithm is a set of five, rather than four, 32-bit values

H = (H_1, H_2, H_3, H_4, H_5),

resulting in a key/output size of 160 bits; the input data size is still kept at 512 bits though. We
now only define four round constants y_1, y_2, y_3, y_4 via

y_1 = 0x5A827999,

y_2 = 0x6ED9EBA1,

y_3 = 0x8F1BBCDC,

y_4 = 0xCA62C1D6.

After padding method two is applied, the data stream is loaded 16 words at a time into X_j for
0 ≤ j < 16, although note that internally the algorithm uses an expanded version of X_j with indices
from 0 to 79. We then execute the steps in Algorithm 14.3 for each 16 words entered from the data
stream. After all data has been read in, the output is the concatenation of the final value of

H_1, H_2, H_3, H_4, H_5.

Algorithm 14.3: The SHA-1 f(X || s_{r−1}) compression function

(A, B, C, D, E) ← s_{r−1} = (H_1, H_2, H_3, H_4, H_5).
/* Expansion */
for j = 16 to 79 do
    X_j ← ((X_{j−3} ⊕ X_{j−8} ⊕ X_{j−14} ⊕ X_{j−16}) <<< 1).
/* Round 1 */
for j = 0 to 19 do
    t ← (A <<< 5) + f(B, C, D) + E + X_j + y_1.
    (A, B, C, D, E) ← (t, A, B <<< 30, C, D).
/* Round 2 */
for j = 20 to 39 do
    t ← (A <<< 5) + h(B, C, D) + E + X_j + y_2.
    (A, B, C, D, E) ← (t, A, B <<< 30, C, D).
/* Round 3 */
for j = 40 to 59 do
    t ← (A <<< 5) + g(B, C, D) + E + X_j + y_3.
    (A, B, C, D, E) ← (t, A, B <<< 30, C, D).
/* Round 4 */
for j = 60 to 79 do
    t ← (A <<< 5) + h(B, C, D) + E + X_j + y_4.
    (A, B, C, D, E) ← (t, A, B <<< 30, C, D).
(H_1, H_2, H_3, H_4, H_5) ← s_r = (H_1 + A, H_2 + B, H_3 + C, H_4 + D, H_5 + E).

Note the one-bit left rotation in the expansion step; an earlier algorithm called SHA (now called
SHA-0) was initially proposed by NIST which did not include this one-bit rotation. However, this
was soon replaced by the new algorithm SHA-1. It turns out that this single one-bit rotation
improves the security of the resulting hash function quite a lot, since SHA-0 is now considered
broken, whereas SHA-1 is considered just about alright (but still needing to be replaced).

To obtain the standardized version of SHA-1, the initial state s_0 is initialized with the values

H_1 ← 0x67452301, H_2 ← 0xEFCDAB89,

H_3 ← 0x98BADCFE, H_4 ← 0x10325476,

H_5 ← 0xC3D2E1F0.
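
The description of SHA-1 above can be turned into a short Python sketch and compared against the
implementation shipped in Python's hashlib module. The big-endian encoding of words and of the
length field is that of the SHA-1 standard, which the text leaves implicit; the names rotl,
compress and sha1 are choices made for this sketch, which is illustrative and not intended for
production use.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF
Y = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)  # y_1, ..., y_4
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def rotl(x, s):
    """Rotate a 32-bit word left by s bits."""
    return ((x << s) | (x >> (32 - s))) & MASK


def f(u, v, w): return (u & v) | (~u & w)
def g(u, v, w): return (u & v) | (u & w) | (v & w)
def h(u, v, w): return u ^ v ^ w


def compress(state, block):
    """Algorithm 14.3: one application of the SHA-1 compression function."""
    X = list(struct.unpack(">16I", block))   # 16 big-endian 32-bit words
    for j in range(16, 80):                  # expansion, with the 1-bit rotation
        X.append(rotl(X[j - 3] ^ X[j - 8] ^ X[j - 14] ^ X[j - 16], 1))
    A, B, C, D, E = state
    for j in range(80):
        fn = (f, h, g, h)[j // 20]           # rounds 1 to 4
        t = (rotl(A, 5) + fn(B, C, D) + E + X[j] + Y[j // 20]) & MASK
        A, B, C, D, E = t, A, rotl(B, 30), C, D
    return tuple((s + v) & MASK for s, v in zip(state, (A, B, C, D, E)))


def sha1(msg: bytes) -> bytes:
    """Pad with method two, then iterate the compression function."""
    padded = (msg + b"\x80" + b"\x00" * ((55 - len(msg)) % 64)
              + struct.pack(">Q", 8 * len(msg)))
    state = IV
    for i in range(0, len(padded), 64):
        state = compress(state, padded[i:i + 64])
    return struct.pack(">5I", *state)


print(sha1(b"abc") == hashlib.sha1(b"abc").digest())
```

Removing the call rotl(..., 1) in the expansion loop would give the SHA-0 variant mentioned above.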


14.4.3. SHA-2: We only present the details of the SHA-256 variant; the others in the SHA-2
family are relatively similar. Unlike the other members of the MD-4 family, the SHA-2 algorithms
consist of a larger number of rounds, each of one step. For an arbitrary input message m, SHA-256
produces a 256-bit message digest (or hash). The length ℓ of the message m is bounded by
0 ≤ ℓ < 2^64, due to the use of the standard MD-4 family padding procedure, namely what we have
called padding method two.

SHA-256 processes the input message block by block, where each application of the compression
function consists of 64 iterations of a single step. The step function makes use of slightly
different $f$ and $g$ functions than those used in MD-4 and SHA-1. The SHA-2 $f$ and $g$ functions are
given by

\[ f(u, v, w) = (u \wedge v) \oplus ((\neg u) \wedge w), \]
\[ g(u, v, w) = (u \wedge v) \oplus (u \wedge w) \oplus (v \wedge w). \]

SHA-2 also makes use of the following functions on single 32-bit words

\[ \Sigma_0(x) = (x \ggg 2) \oplus (x \ggg 13) \oplus (x \ggg 22), \]
\[ \Sigma_1(x) = (x \ggg 6) \oplus (x \ggg 11) \oplus (x \ggg 25), \]
\[ \sigma_0(x) = (x \ggg 7) \oplus (x \ggg 18) \oplus (x \gg 3), \]
\[ \sigma_1(x) = (x \ggg 17) \oplus (x \ggg 19) \oplus (x \gg 10), \]

where $\ggg$ denotes right rotate, and $\gg$ denotes shift right. There are 64 constant words $K_0, \ldots, K_{63}$,
which represent the first 32 bits of the fractional parts of the cube roots of the first 64 prime numbers.
For SHA-256 the internal state of the algorithm is a set of eight 32-bit values

\[ H = (H_1, H_2, H_3, H_4, H_5, H_6, H_7, H_8), \]

corresponding to 256 bits of key/output. Again the input is processed 512 bits at a time. The data
stream is loaded 16 words at a time into $X_j$ for $0 < j \le 16$, which are then expanded to 64 words
as in Algorithm 14.4. After all data have been read in, the output is the concatenation of the final
values of

\[ H_1, H_2, H_3, H_4, H_5, H_6, H_7, H_8. \]

The standard defines the initial state $s_0$ for SHA-256 to be given by setting the initial values to

$H_1 \leftarrow$ 0x6A09E667, $H_2 \leftarrow$ 0xBB67AE85,
$H_3 \leftarrow$ 0x3C6EF372, $H_4 \leftarrow$ 0xA54FF53A,
$H_5 \leftarrow$ 0x510E527F, $H_6 \leftarrow$ 0x9B05688C,
$H_7 \leftarrow$ 0x1F83D9AB, $H_8 \leftarrow$ 0x5BE0CD19.

Algorithm 14.4: The SHA-256 $f(X \| s_{r-1})$ compression function

$(A, B, C, D, E, F, G, H) \leftarrow (H_1, H_2, H_3, H_4, H_5, H_6, H_7, H_8)$.

/* Expansion */
for $i = 17$ to $64$ do
    $X_i \leftarrow \sigma_1(X_{i-2}) + X_{i-7} + \sigma_0(X_{i-15}) + X_{i-16}$.

/* Round */
for $i = 1$ to $64$ do
    $t_1 \leftarrow H + \Sigma_1(E) + f(E, F, G) + K_{i-1} + X_i$.
    $t_2 \leftarrow \Sigma_0(A) + g(A, B, C)$.
    $(A, B, C, D, E, F, G, H) \leftarrow (t_1 + t_2, A, B, C, D + t_1, E, F, G)$.

$(H_1, H_2, H_3, H_4, H_5, H_6, H_7, H_8) \leftarrow s_r =$
    $(H_1 + A, H_2 + B, H_3 + C, H_4 + D, H_5 + E, H_6 + F, H_7 + G, H_8 + H)$.
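As a check on the description of SHA-256, the following Python sketch derives the 64 constants from the cube roots of the first 64 primes, implements the expansion and round steps, and compares the result with `hashlib`. It is a minimal illustration, not an optimized or side-channel-safe implementation; the helper names (`icbrt`, `first_primes`, `sha256_compress`, `sha256`) are ours, and words are indexed from zero as is usual in code.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def icbrt(n):
    """Integer cube root: the largest r with r**3 <= n (binary search)."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def first_primes(count):
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

# First 32 bits of the fractional part of the cube root of each prime.
K = [icbrt(p << 96) & MASK for p in first_primes(64)]

def rotr(x, n):
    """Rotate the 32-bit word x right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK

def sha256_compress(state, block):
    X = list(struct.unpack(">16I", block))
    for i in range(16, 64):  # expansion
        s0 = rotr(X[i - 15], 7) ^ rotr(X[i - 15], 18) ^ (X[i - 15] >> 3)
        s1 = rotr(X[i - 2], 17) ^ rotr(X[i - 2], 19) ^ (X[i - 2] >> 10)
        X.append((s1 + X[i - 7] + s0 + X[i - 16]) & MASK)
    A, B, C, D, E, F, G, H = state
    for i in range(64):  # 64 iterations of the single step
        S1 = rotr(E, 6) ^ rotr(E, 11) ^ rotr(E, 25)
        f = (E & F) ^ (~E & G)
        t1 = (H + S1 + f + K[i] + X[i]) & MASK
        S0 = rotr(A, 2) ^ rotr(A, 13) ^ rotr(A, 22)
        g = (A & B) ^ (A & C) ^ (B & C)
        t2 = (S0 + g) & MASK
        A, B, C, D, E, F, G, H = (t1 + t2) & MASK, A, B, C, (D + t1) & MASK, E, F, G
    return tuple((h + v) & MASK
                 for h, v in zip(state, (A, B, C, D, E, F, G, H)))

def sha256(msg):
    state = (0x6A09E667, 0xBB67AE85, 0x3C6EF372, 0xA54FF53A,
             0x510E527F, 0x9B05688C, 0x1F83D9AB, 0x5BE0CD19)
    padded = (msg + b"\x80" + b"\x00" * ((55 - len(msg)) % 64)
              + struct.pack(">Q", 8 * len(msg)))
    for i in range(0, len(padded), 64):
        state = sha256_compress(state, padded[i:i + 64])
    return struct.pack(">8I", *state)

assert sha256(b"abc") == hashlib.sha256(b"abc").digest()
```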


14.5.  HMAC 

It is very tempting to define a MAC function from an unkeyed hash function as

\[ t = H(k \| m \| \mathrm{pad}_i(|k| + |m|, b)) \]

where $k$ is the key for the MAC. After all a hash function should behave like a random oracle,
and a random oracle by definition will be a secure MAC$^2$. However, if we use the Merkle-Damgard
construction for $H$ then there is a simple attack. Let $\mathrm{MD}[f, s]$ denote one of the standardized
Merkle-Damgard-based hash functions with a fixed compression function $f$ and a fixed IV $s$. For
simplicity assume the key $k$ for the MAC is one block long, i.e. $\ell$ bits in length. The adversary first
obtains a MAC $t$ on the message $m$, by asking for

\[ t = \mathrm{MD}[f, s](k \| m \| \mathrm{pad}_i(\ell + |m|, \ell)). \]

The adversary now knows the state of the Merkle-Damgard function at this point, namely $t$. The
adversary can then on his own compute the tag on any message of the form

\[ m \| \mathrm{pad}_i(\ell + |m|, \ell) \| m' \]

for an $m'$ of the adversary's choice. To get around this problem we define a function called HMAC,
for Hash-MAC. Since HMAC is specifically designed to avoid the above problem with the
Merkle-Damgard construction, it only makes sense to use it with hash functions created in this way. To
define HMAC though we first create a related MAC which is easier to analyse called NMAC.

$^2$ It is perhaps worth proving this to yourself using the earlier games for a MAC and a random oracle.
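The extension attack can be made concrete with a small experiment. The sketch below builds a toy Merkle-Damgard hash whose compression function is simply SHA-256 applied to state and block; that choice, the 64-byte block size and all function names are assumptions made purely for illustration. The adversary's function `extend` never sees the key, only its length.

```python
import hashlib
import struct

BLOCK = 64  # toy block size in bytes (an assumption for this demo)

def compress(state, block):
    """Toy compression function; a stand-in, not a standardized design."""
    return hashlib.sha256(state + block).digest()

def md_pad(total_len):
    """Length padding for a message of total_len bytes."""
    zeros = (-(total_len + 9)) % BLOCK
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", 8 * total_len)

def md_hash(data, state=bytes(32), absorbed=0):
    """Iterate compress over data starting from `state`.
    `absorbed` counts the bytes already processed to reach `state`."""
    data = data + md_pad(absorbed + len(data))
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def naive_mac(key, msg):
    """The tempting but insecure MAC t = H(k || m)."""
    return md_hash(key + msg)

def extend(tag, key_len, msg, extra):
    """Forge a tag on msg || padding || extra from a tag on msg."""
    glue = md_pad(key_len + len(msg))
    forged_msg = msg + glue + extra
    forged_tag = md_hash(extra, state=tag,
                         absorbed=key_len + len(msg) + len(glue))
    return forged_msg, forged_tag

key = b"k" * BLOCK                      # one-block key, unknown to the attacker
msg = b"pay 10 to bob"
tag = naive_mac(key, msg)               # obtained from the MAC oracle
forged_msg, forged_tag = extend(tag, len(key), msg, b" and 1000 to eve")
assert naive_mac(key, forged_msg) == forged_tag   # the forgery verifies
```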


14.5.1. NMAC: Nested MAC, called NMAC, is built from two keyed hash functions $F_{k_1}$ and
$G_{k_2}$. The function NMAC is then defined by

\[ \mathrm{NMAC}_{k_1,k_2}(m) = F_{k_1}(G_{k_2}(m)). \]

In particular we assume that $F_{k_1}(x)$ corresponds to a single application of a Merkle-Damgard
(unkeyed) compression function $f$ with input size $\ell$, output size $n$, such that $k_1 \in \{0,1\}^n$ and
$|x| + n + 66 \le \ell$, and initial state $k_1$, but with a slightly strange padding method, namely

\[ F_{k_1}(x) = f((x \| \mathrm{pad}_2(\ell + |x|, \ell)) \| k_1) = \mathrm{MD}[f, k_1]^*(x). \]

We let $\mathrm{MD}[f, k_1]^*$ denote the function $\mathrm{MD}[f, k_1](x)$ with this slightly modified padding formula. We
assume that $F_{k_1}$ is a secure message authentication code, which recall was one of our assumptions
on the Merkle-Damgard compression functions we described above.

The function $G_{k_2}$ is a wCR secure hash function, which produces outputs of size $b$ with
$b + n + 66 \le \ell$. In practice we will take $b = n$, since we will construct $G_{k_2}$ out of the same basic
compression function $f$, with the same modified padding scheme above (namely the length gets
encoded by adding an extra $\ell$ bits). Thus

\[ G_{k_2}(m) = \mathrm{MD}[f, k_2]^*(m). \]

It is easily checked that with this modified padding method Theorem 14.4 still applies, and we can
hence conclude that $G_{k_2}$ is indeed a wCR secure hash function assuming $f$ is wCR secure. Thus
we have

\[ \mathrm{NMAC}_{k_1,k_2}(m) = \mathrm{MD}[f, k_1]^*(\mathrm{MD}[f, k_2]^*(m)). \]

We then have the following theorem.

Theorem 14.7. The message authentication code NMAC is EUF-CMA secure assuming $F_{k_1}$ is an
EUF-CMA secure MAC on $b$-bit messages and $G_{k_2}$ is a wCR secure hash function outputting $b$-bit
messages.


Proof. Let $A$ be the adversary against NMAC. We will use $A$ to construct an adversary $B$ against
the MAC function $F_{k_1}$. Algorithm $B$ has no input, but has access to an oracle which computes
$F_{k_1}$ for her. First $B$ generates a random $k_2$, sets a list $\mathcal{L}$ to $\emptyset$ and then calls algorithm $A$.

When algorithm $A$ queries its MAC oracle on input $m$, algorithm $B$ responds by first computing
the application of $G_{k_2}$ on $m$ to obtain $y$. This is then suitably padded and passed to the oracle
provided to algorithm $B$ to obtain $t$. Thus algorithm $B$ obtains an NMAC tag on $m$ under the key
$(k_1, k_2)$, and passes this back to algorithm $A$. Before doing so it appends the pair $(y, t)$ to the list
$\mathcal{L}$.

If $A$ outputs a forgery $(m^*, t^*)$ on NMAC then algorithm $B$ computes $y^* = G_{k_2}(m^*)$. If there
exists $(y^*, t^*) \in \mathcal{L}$ then $B$ aborts, otherwise $B$ returns the pair $(y^*, t^*)$ as its EUF-CMA forgery for
$F_{k_1}$.

Let $\epsilon_A$ denote the probability that $A$ wins, and $\epsilon_B$ the probability that $B$ wins. We have

\[ \epsilon_A \le \epsilon_B + \Pr[B \text{ aborts}]. \]

We now note that the algorithm $B$ could also be used to find collisions in $G_{k_2}$ for a secret value
$k_2$. Instead of picking $k_2$ at random we pick $k_1$, and then call $A$ and respond using the oracle for
$G_{k_2}$ and then applying $F_{k_1}$ for the known $k_1$. Thus $\Pr[B \text{ aborts}]$ is bounded by the probability of
breaking the wCR security of $G_{k_2}$.

Hence, if $F_{k_1}$ is an EUF-CMA secure MAC and $G_{k_2}$ is wCR secure, then the two probabilities
$\epsilon_B$ and $\Pr[B \text{ aborts}]$ are “small”, and hence so is $\epsilon_A$ and so NMAC is secure. $\square$


14.5.2. Building HMAC from NMAC: We can now build the function HMAC from our simpler
function NMAC. The function HMAC is built from a standardized hash function given by
$H(m) = \mathrm{MD}[f, IV](m)$. It makes use of two padding values opad (for outer pad) and ipad (for inner pad).
These are $\ell$-bit fixed values which are defined to be the byte 0x5C repeated $\ell/8$ times for opad and
the byte 0x36 repeated $\ell/8$ times for ipad. The keys to HMAC are a single $\ell$-bit value $k$.

We then define

\[ \mathrm{HMAC}_k(m) = H((k \oplus \mathrm{opad}) \| H((k \oplus \mathrm{ipad}) \| m)) \]
\[ \qquad = \mathrm{MD}[f, IV]((k \oplus \mathrm{opad}) \| \mathrm{MD}[f, IV]((k \oplus \mathrm{ipad}) \| m)). \]

If we set $k_1 = f(k \oplus \mathrm{opad})$ and $k_2 = f(k \oplus \mathrm{ipad})$ then we have

\[ \mathrm{HMAC}_k(m) = \mathrm{MD}[f, k_1]^*(\mathrm{MD}[f, k_2]^*(m)) = \mathrm{NMAC}_{k_1,k_2}(m). \]

Thus HMAC is a specific instance of NMAC, where the keys are derived in a very special manner.
Hence we also require that the output of $f(m \| IV)$ acts like a weak form of pseudo-random function.

The  astute  reader  will  have  noticed  that  when  instantiated  with  SHA-256  the  proof  of  HMAC 
will  not  apply,  since  the  outer  application  of  SHA-256  is  performed  on  a  message  of  three  blocks, 
one for the key $k \oplus \mathrm{opad}$, one on the output of the inner application of SHA-256, and one on the
padding  block.  The  above  proof  of  reduction  to  NMAC  can  clearly  be  modified  to  cope  with  this, 
with  some  additional  assumptions  and  modifications  to  the  NMAC  proof.  However,  the  added 
complications  produce  no  extra  insight,  so  we  do  not  pursue  them  here.  The  interested  reader 
should  also  note  that  there  is  a  more  elaborate  proof  of  the  HMAC  construction  which  assumes 
even  less  of  the  components  used  to  define  the  underlying  Merkle-Damgard  hash  function. 
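The HMAC definition is short enough to write out directly. The following sketch instantiates it with SHA-256, using the byte values from the HMAC standard (ipad is 0x36 repeated, opad is 0x5C repeated) and the standard's rule that keys shorter than a block are zero-padded and longer keys are hashed first; that key handling is not discussed in the text above. The function name `hmac_sha256` is ours, and the output is compared with Python's `hmac` module.

```python
import hashlib
import hmac

def hmac_sha256(key, msg):
    """HMAC_k(m) = H((k xor opad) || H((k xor ipad) || m)) with H = SHA-256."""
    block = 64                          # SHA-256 block size in bytes
    if len(key) > block:                # long keys are hashed first
        key = hashlib.sha256(key).digest()
    key = key.ljust(block, b"\x00")     # short keys are zero-padded
    k_ipad = bytes(b ^ 0x36 for b in key)
    k_opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(k_ipad + msg).digest()
    return hashlib.sha256(k_opad + inner).digest()

tag = hmac_sha256(b"secret key", b"a message")
assert tag == hmac.new(b"secret key", b"a message", hashlib.sha256).digest()
```

In application code one would call the library routine and compare tags with `hmac.compare_digest` rather than `==`, to avoid timing leaks.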


14.6. Merkle-Damgard-Based Key Derivation Function

It  is  relatively  straightforward  to  define  KDFs  given  an  (unkeyed)  hash  function  from  the  Merkle- 
Damgard  family.  Recall  that  a  KDF  should  take  an  arbitrary  length  input  string  and  produce 
an  arbitrary  length  output  string  which  should  look  pseudo-random.  There  are  two  basic  ways  of 
doing  this  in  the  literature;  we  just  give  the  basic  ideas  behind  these  constructions  given  a  fixed 
hash function $H$ of output length $t$ bits and block size $b$. Let the number of output bits required
from the KDF be $n$ and set $\mathrm{cnt} = \lceil n/t \rceil$. We let $\mathrm{trunc}_n(m)$ denote the truncation of the message
$m$ to $n$ bits in length, by removing bits to the left (the most significant bits), and let $\langle i \rangle_v$ denote
the encoding of the integer $i$ in $v$ bits.

Method 1: This is a relatively basic method, whose security rests on assuming that $H$ itself acts
like a random oracle.

\[ \mathrm{KDF}(m) = \mathrm{trunc}_n\Big( H(m \| \langle \mathrm{cnt}-1 \rangle_{64}) \| H(m \| \langle \mathrm{cnt}-2 \rangle_{64}) \| \cdots \| H(m \| \langle 1 \rangle_{64}) \| H(m \| \langle 0 \rangle_{64}) \Big). \]

Thus the $i$th output block is given by $H(m \| \langle i \rangle_{64})$, except for the $(\mathrm{cnt}-1)$st block which is given
by $\mathrm{trunc}_{n \pmod{t}}(H(m \| \langle \mathrm{cnt}-1 \rangle_{64}))$. This acts like CTR Mode in some sense, and can be very
efficient if we first pad $m$ out to a multiple of $b$ and then compute $k = \mathrm{MD}[f, IV](m)$ with no
padding method applied, and then compute

\[ \mathrm{KDF}(m) = \mathrm{trunc}_n\Big( f(\langle \mathrm{cnt}-1 \rangle_{64} \| \mathrm{pad}_2(64 + |m|, b) \| k) \| f(\langle \mathrm{cnt}-2 \rangle_{64} \| \mathrm{pad}_2(64 + |m|, b) \| k) \| \cdots \]
\[ \qquad \cdots \| f(\langle 1 \rangle_{64} \| \mathrm{pad}_2(64 + |m|, b) \| k) \| f(\langle 0 \rangle_{64} \| \mathrm{pad}_2(64 + |m|, b) \| k) \Big). \]

Method 2: The second method utilizes the fact that HMAC itself acts like a pseudo-random function,
and that the proof of HMAC establishes this in a stronger way than assuming the underlying
hash function is a random oracle. Thus the second method is based on HMAC and the $i$th output
block is given by

\[ k_i = \mathrm{HMAC}(m \| k_{i-1} \| \langle i \pmod{256} \rangle_8), \]

where $k_{-1}$ is defined to be the zero string of $t$ bits.


14.7.  MACs  and  KDFs  Based  on  Block  Ciphers 

In  this  section  we  show  how  MACs  and  KDFs  can  also  be  derived  from  block  ciphers,  in  addition 
to  the  compression  functions  we  considered  in  the  last  section. 

14.7.1. Message Authentication Codes from Block Ciphers: Some of the most widely used
message authentication codes in practice are based on the CBC Mode of symmetric encryption, and
are called “CBC-MAC”. This is a misnomer, as for all bar a limited number of applications
the following construction on its own does not form a secure message authentication code. However,
we will return later to see how one can form secure MACs from the following construction.

Using a $b$-bit block cipher to give a $b$-bit keyed hash function is done as follows:

• The message $m$ is padded to form a series of $b$-bit blocks; in principle any of the previous
padding schemes can be applied.

• The blocks are encrypted using the block cipher in CBC Mode with the zero IV.

• Take the final block as the MAC.

Hence, if the $b$-bit data blocks, after padding, are

\[ m_1, m_2, \ldots, m_q \]

then the MAC is computed by first setting $I_1 = m_1$ and $O_1 = e_k(I_1)$ and then performing the
following for $i = 2, 3, \ldots, q$:

\[ I_i = m_i \oplus O_{i-1}, \qquad O_i = e_k(I_i). \]

The final value $t = O_q$ is then output as the result of the computation. This is all summarized in
Figure 14.7, and we denote this function by $\mathrm{CBC\text{-}MAC}_k(m)$.

Figure 14.7. “CBC-MAC”: flow diagram

We first look at an attack against CBC-MAC with padding method zero. Suppose we have a
MAC value $t$ on a message

\[ m_1, m_2, \ldots, m_q, \]

consisting of a whole number of blocks. Then the MAC tag $t$ is also the MAC of the double length
message

\[ m_1, m_2, \ldots, m_q, t \oplus m_1, m_2, m_3, \ldots, m_q. \]

To see this notice that the input to the $(q+1)$st block cipher invocation is equal to the value of the
MAC on the original message, namely $t$, exclusive-or'd with the $(q+1)$st block of the new message,
namely $t \oplus m_1$. Thus the input to the $(q+1)$st cipher invocation is equal to $m_1$. This is
the same as the input to the first cipher invocation, and so the MAC on the double length message
is also equal to $t$.
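The double-length forgery is easy to reproduce. Python's standard library has no block cipher, so the sketch below uses a truncated HMAC-SHA256 as a stand-in for $e_k$; this is purely an assumption of the demo (CBC-MAC only ever evaluates $e_k$ in the forward direction, and the attack does not depend on which keyed function is used). The names `e`, `xor` and `cbc_mac` are ours.

```python
import hashlib
import hmac

B = 16  # block size in bytes

def e(k, block):
    """Stand-in for the block cipher e_k (assumption: truncated HMAC-SHA256)."""
    return hmac.new(k, block, hashlib.sha256).digest()[:B]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_mac(k, blocks):
    """Plain CBC-MAC with zero IV over a list of B-byte blocks."""
    out = bytes(B)
    for m in blocks:
        out = e(k, xor(m, out))   # O_i = e_k(m_i xor O_{i-1})
    return out

k = b"secret key"
blocks = [b"A" * B, b"B" * B, b"C" * B]
t = cbc_mac(k, blocks)            # obtained from the MAC oracle

# Forgery: m_1, ..., m_q, t xor m_1, m_2, ..., m_q has the same tag t.
forged = blocks + [xor(t, blocks[0])] + blocks[1:]
assert cbc_mac(k, forged) == t
```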

One could suspect that use of more elaborate padding techniques would make attacks impossible;
so let us consider padding method three. Let $b$ denote the block length of the cipher and let
$P(n)$ denote the encoding within a block of the number $n$. To MAC a single block message $m_1$ one
then computes

\[ M_1 = e_k(e_k(m_1) \oplus P(b)). \]

Suppose one obtains the MACs $t_1$ and $t_2$ on the single block messages $m_1$ and $m_2$. Then one
requests the MAC on the three-block message

\[ m_1, P(b), m_3 \]

for some new block $m_3$, obtaining the tag $t_3$, i.e.

\[ t_3 = e_k(e_k(e_k(e_k(m_1) \oplus P(b)) \oplus m_3) \oplus P(3 \cdot b)). \]

Now consider what is the MAC value on the three-block message

\[ m_2, P(b), m_3 \oplus t_1 \oplus t_2. \]

This tag is equal to $t_3'$, where

\[ t_3' = e_k(e_k(e_k(e_k(m_2) \oplus P(b)) \oplus m_3 \oplus t_1 \oplus t_2) \oplus P(3 \cdot b)) \]
\[ \quad = e_k(e_k(e_k(e_k(m_2) \oplus P(b)) \oplus m_3 \oplus e_k(e_k(m_1) \oplus P(b)) \oplus e_k(e_k(m_2) \oplus P(b))) \oplus P(3 \cdot b)) \]
\[ \quad = e_k(e_k(m_3 \oplus e_k(e_k(m_1) \oplus P(b))) \oplus P(3 \cdot b)) \]
\[ \quad = e_k(e_k(e_k(e_k(m_1) \oplus P(b)) \oplus m_3) \oplus P(3 \cdot b)) \]
\[ \quad = t_3. \]

Hence, we see that on their own the non-trivial padding methods do not protect against MAC
forgery attacks. However, if used in a one-time setting, i.e. for providing authentication to the
ciphertext in a data encapsulation mechanism, then the basic CBC-MAC construction is indeed
secure.
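The forgery against padding method three can be checked numerically in the same way. As before, the block cipher is replaced by a truncated HMAC-SHA256 purely as a stand-in (an assumption of this demo), $P(n)$ is taken to be the big-endian encoding of $n$ in one block, and the helper names are ours.

```python
import hashlib
import hmac

B = 16  # block size in bytes, so b = 128 bits

def e(k, block):
    """Stand-in for the block cipher e_k (assumption: truncated HMAC-SHA256)."""
    return hmac.new(k, block, hashlib.sha256).digest()[:B]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def P(n):
    """Encode the number n within a single block."""
    return n.to_bytes(B, "big")

def cbc_mac_len(k, blocks):
    """CBC-MAC with the message bit length appended as a final block."""
    out = bytes(B)
    for m in blocks + [P(8 * B * len(blocks))]:
        out = e(k, xor(m, out))
    return out

k = b"secret key"
m1, m2, m3 = b"A" * B, b"B" * B, b"C" * B
t1 = cbc_mac_len(k, [m1])                  # three oracle queries
t2 = cbc_mac_len(k, [m2])
t3 = cbc_mac_len(k, [m1, P(8 * B), m3])

# The tag t3 is also valid for a message that was never queried.
forged = [m2, P(8 * B), xor(xor(m3, t1), t2)]
assert cbc_mac_len(k, forged) == t3
```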

It should be noted that if we put the length as the first block in the message then this padding
method does produce a secure MAC. However, this is not used in practice since it requires the
length of a message to be known before one has perhaps read it in. Thus in practice a different
technique is used, the most popular of which is very similar to the NMAC construction above. We
pick two block cipher keys $k_1, k_2$ at random and set

\[ \mathrm{EMAC}_{k_1,k_2}(m) = e_{k_2}(\mathrm{CBC\text{-}MAC}_{k_1}(m)). \]

Just like NMAC and HMAC the inner function performs a MAC-like operation by mapping a long
message to a short, block-sized value, and then the outer function uses this value as input to produce
what looks like a random MAC value. However, we cannot reuse the proof of NMAC here, since that
required the inner function to be weakly collision resistant which we already know CBC-MAC does
not satisfy. Thus a new proof technique is needed.
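A minimal sketch of EMAC follows, again with a truncated HMAC-SHA256 standing in for the block cipher (an assumption of the demo, as are the function names). It also shows that the double-length trick against plain CBC-MAC no longer yields a valid tag, because the adversary sees only the re-encrypted value and not the inner chaining state. This single check is of course an illustration, not evidence of security.

```python
import hashlib
import hmac

B = 16  # block size in bytes

def e(k, block):
    """Stand-in for the block cipher e_k (assumption: truncated HMAC-SHA256)."""
    return hmac.new(k, block, hashlib.sha256).digest()[:B]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_mac(k, blocks):
    out = bytes(B)
    for m in blocks:
        out = e(k, xor(m, out))
    return out

def emac(k1, k2, blocks):
    """EMAC: re-encrypt the CBC-MAC value under a second, independent key."""
    return e(k2, cbc_mac(k1, blocks))

k1, k2 = b"inner key", b"outer key"
blocks = [b"A" * B, b"B" * B, b"C" * B]
t = emac(k1, k2, blocks)

# The double-length forgery that works on plain CBC-MAC now fails.
attempt = blocks + [xor(t, blocks[0])] + blocks[1:]
assert emac(k1, k2, attempt) != t
```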

Theorem 14.8. EMAC is a secure message authentication code assuming the underlying block
cipher $e_k$ acts like a pseudo-random function. In particular let $A$ denote an adversary against
EMAC which makes $q$ queries of size at most $\ell$ blocks to its function oracle. Then there is an
adversary $B$ such that

\[ \mathrm{Adv}^{\mathrm{EUF\text{-}CMA}}_{\mathrm{EMAC}}(A; q) \le 2 \cdot \mathrm{Adv}^{\mathrm{PRP}}_{e_k}(B) + \frac{2 \cdot T^2}{2^b} + \frac{1}{2^b}, \]

where $b$ is the block size of the block cipher $e_k$ and $T = q \cdot \ell$.


Proof. The proof technique is very similar to the proof of Theorem 13.6, although far more
intricate in analysis, thus we only sketch the details. Just like in the previous proof we switch from
the actual block cipher family to a truly random permutation, and then a truly random
function. So from now on we assume that $e_{k_1}$ is replaced by the random function $f_1$ and $e_{k_2}$ is
replaced by the random function $f_2$. As per the proof of Theorem 13.6, the adversary will only
notice this has happened if she causes an output collision on one of the random functions. The
factor of two in the theorem on the $\mathrm{Adv}^{\mathrm{PRP}}_{e_k}(B)$ term is due to the fact that we switch two block
ciphers to random functions.

We note that a truly random function is a secure message authentication code by definition,
and so all we need show now is that the output of EMAC behaves as a random function when
the constituent parts are random functions, irrespective of the strategy of the adversary. Thus we
essentially need to show that the input to the outer layer function is itself random.

The only way the adversary could exploit the actual CBC-MAC definition to obtain a non-random
output, which she could then exploit to create a forgery, would be to obtain a collision
on the inputs to one of the calls to $f_1$ or $f_2$, and note she must do so without actually seeing the
outputs to the calls, bar the final output of the EMAC function. This is where the intricate analysis
comes in, which we defer to the papers referenced in the Further Reading section. $\square$


14.7.2. Key Derivation Function from Block Ciphers: It is now relatively easy to define
a key derivation function based on block ciphers. We use the fact that CBC-MAC or EMAC act
like pseudo-random functions when applied to a long string; we then use the output to key a CTR
Mode operation. For the “inner” compressing part of the key derivation function we can actually
use CBC-MAC with the zero key. Thus we have that the $i$th block output by the KDF will be

\[ e_k(\langle i \rangle_b) \]

where $k = \mathrm{CBC\text{-}MAC}_0(m)$.


14.8. The Sponge Construction and SHA-3

Having  looked  at  how  the  Merkle-Damgard  construction  and  block  ciphers  can  be  used  to  define 
various  different  cryptographic  primitives,  we  now  present  a  more  modern  technique  to  create  hash 
functions,  message  authentication  codes,  key  derivation  functions  and  more.  This  modern  technique 
is  called  the  sponge  construction.  The  two  prior  techniques  use  relatively  strong  components, 
namely  PRPs  and  keyed  compression  functions,  both  of  which  satisfy  relatively  strong  security 
requirements. 

14.8.1.  The  Sponge  Construction:  The  sponge  construction  takes  a  different  approach;  as  its 
basic primitive it takes a fixed permutation $p$; such a permutation is clearly not one-way. Security
is  instead  obtained  by  the  method  of  chaining  the  permutation  with  the  key,  the  input  message 
and  the  padding  scheme. 

The  entire  construction  is  called  a  sponge,  as  the  message  is  first  entered  into  the  sponge  in  a 
process  akin  to  a  sponge  absorbing  water.  Then  when  we  require  output  the  sponge  is  squeezed  to 
obtain  as  much  output  as  we  require.  The  sponge  maintains  an  internal  state  of  r  +  c  bits,  and 
the permutation $p$ acts as a permutation on the set $\{0,1\}^{r+c}$. The value $r$ is called the rate of the
sponge  and  the  value  c  is  called  the  capacity.  The  initial  state  is  set  to  zero  and  then  the  message 
is  entered  block  by  block  into  the  top  r  bits  of  the  state,  using  exclusive-or.  See  Figure  14.8  for  a 
graphical  description  of  how  a  (keyed)  sponge  works.  Note  how  padding  method  four  is  utilized; 
it  is  important  that  this  is  the  padding  method  used  in  order  to  guarantee  security  if  we  use  the 
same  permutation  in  a  sponge  with  different  values  for  the  rate.  In  particular  there  is  no  need  to 
add  length  encodings  as  in  the  Merkle-Damgard  construction.  To  define  an  unkeyed  hash  function 
one  simply  sets  the  key  k  to  be  the  empty  string. 


Figure 14.8. The sponge construction $\mathrm{Sp}[p]$

The idea is that as we squeeze the sponge we obtain $r$ bits of output at a time. However, the
output tells us nothing about the $c$ hidden bits of the state. Thus to fully predict the next set of
$r$ output bits we appear to need to guess the $c$ bits of missing state, or at least find a collision on
these $c$ bits of missing state. A careful analysis reveals that for a random permutation $p$ the sponge
construction $\mathrm{Sp}[p]$ has security equivalent to $c/2$ bits of symmetric cipher security, i.e. the best
attacks require roughly $2^{c/2}$ operations.
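The absorb and squeeze phases can be illustrated with a toy sponge. In the sketch below the rate and capacity (8 and 24 bytes), the byte-level variant of the padding, and above all the permutation `toy_permutation` are assumptions chosen only so that the code is short and self-contained; the permutation is a genuine bijection on the state but has no cryptographic strength whatsoever.

```python
R, C = 8, 24            # toy rate and capacity, in bytes (assumed values)
N = 8 * (R + C)         # state size in bits

def toy_permutation(state):
    """An insecure but bijective map on (R + C)-byte states."""
    x = int.from_bytes(state, "big")
    for _ in range(8):
        x = (x * 0x9E3779B97F4A7C15 + 1) % (1 << N)  # odd multiplier: bijective
        x ^= x >> (N // 2)                           # xor-shift: bijective
    return x.to_bytes(R + C, "big")

def pad(data):
    """Byte-level analogue of 1 0...0 1 padding to a multiple of R bytes."""
    zeros = (-(len(data) + 2)) % R
    return data + b"\x80" + b"\x00" * zeros + b"\x01"

def sponge(key, msg, out_len):
    state = bytes(R + C)                       # initial state is zero
    data = pad(key + msg)
    for i in range(0, len(data), R):           # absorbing phase
        block = data[i:i + R] + bytes(C)       # only the top R bytes are touched
        state = toy_permutation(bytes(a ^ b for a, b in zip(state, block)))
    out = b""
    while len(out) < out_len:                  # squeezing phase
        out += state[:R]                       # the C capacity bytes stay hidden
        state = toy_permutation(state)
    return out[:out_len]

digest = sponge(b"", b"abc", 20)               # empty key: unkeyed hash
assert sponge(b"", b"abc", 40)[:20] == digest  # squeezing more extends the output
```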

One can show, using ideas and techniques way beyond what we can cover in this book, that the
sponge construction, even for a zero key, cannot be distinguished from a random oracle (in some
well-defined, but complicated sense). Recall from Chapter 11 that a random oracle is a function
which behaves like a random function, even though the adversary may have access to the function's
“code”.


14.8.2. SHA-3: SHA-3 was a competition organized, much like the AES competition, by NIST.
The competition was launched in 2007 in response to the spectacular improvement in cryptanalysis
of the MD-5 and SHA-1 algorithms in the preceding years. There were 64 competition entries,
which were reduced to five by 2010, these being

• BLAKE, a proposal based on the ChaCha stream cipher.

• Grøstl, a Merkle-Damgard construction using components, such as the S-Box, from AES.

• JH, a sponge-like construction with a similar design philosophy to AES.

• Keccak, a sponge construction, and the eventual winner.

• Skein, a function based on the Threefish block cipher.

Keccak, designed by Joan Daemen, Guido Bertoni, Michaël Peeters and Gilles Van Assche, was
declared the winner in October 2012.

The final SHA-3 function is a sponge-construction-based (unkeyed) hash function with a specific
permutation $p$ (which we shall now define), and with a zero-length key in the main sponge; see
Figure 14.8$^3$. The SHA-3 winner Keccak actually defines four different hash functions, with different
output lengths (just as SHA-2 defines four different hash functions).

Being a sponge construction Keccak is parametrized by two values: the rate $r$ and the capacity
$c$. In Keccak the $r$ and $c$ values can be any values such that $r + c \in \{25, 50, 100, 200, 400, 800, 1600\}$,
since we require $r + c$ to be equal to $25 \cdot 2^\ell$ for some integer value $\ell \in \{0, \ldots, 6\}$. This is due to the
way the internal state of SHA-3 is designed, as we shall explain in a moment. If output hash sizes
of 224, 256, 384 and 512 bits are required then the values of $r$ should be chosen to be 1152, 1088,
832 and 576 respectively. If we go for the most efficient variant we want $r + c = 1600$, and then
the associated capacities are 448, 512, 768 and 1024 respectively. If an arbitrary output length is
required, for example for when used as a stream cipher or as a key derivation function, then one
should use the values $(r, c) = (576, 1024)$.

The state of SHA-3 consists of a $5 \times 5 \times 2^\ell$ three-dimensional matrix of bits. We let $A[x, y, z]$
denote this array. If we fix $y$ then the set $A[\cdot, y, \cdot]$ is called a plane, whereas if we fix $z$ then the set
$A[\cdot, \cdot, z]$ is called a slice; finally if we fix $x$ then the set $A[x, \cdot, \cdot]$ is called a sheet. One-dimensional
components in the $x$, $y$ and $z$ directions are called rows, columns and lanes respectively. See Figure
14.9.

The bit array $A[x, y, z]$ is mapped to a bit vector $a[i]$ using the convention $i = (5 \cdot y + x) \cdot 2^\ell + z$,
for $x, y \in [0, 4]$ and $z \in [0, 2^\ell - 1]$, where, in the bit vector, bit 0 is in the leftmost position and bit
$25 \cdot 2^\ell - 1$ is in the rightmost position. Thus the top $r$ bits of the sponge construction state are bits
$a[0]$ through to $a[r - 1]$ and the bottom $c$ bits are $a[r]$ through to $a[r + c - 1]$.

Figure 14.9. The SHA-3 state and its components

Figure 14.10. The SHA-3 $\theta$ and $\rho$ transforms

$^3$ The author extends his thanks to Joan Daemen in helping with this section, and to the entire Keccak team for
permission to include the figures found at http://keccak.noekeon.org/.

All that remains to define SHA-3 is then to specify the permutation $p$. Just like the AES block
cipher the construction is made up of repeated iteration of a number of very simple transformations.
In particular there are five transformations called $\theta$, $\rho$, $\pi$, $\chi$ and $\iota$. The five functions are defined
as follows, where if an index is less than zero or too large we assume a wrapping around:

• $\theta$: For all $x, y, z$ apply the transform, see Figure 14.10,

\[ A[x, y, z] \leftarrow A[x, y, z] \oplus \bigoplus_{y'=0}^{4} A[x-1, y', z] \oplus \bigoplus_{y'=0}^{4} A[x+1, y', z-1]. \]

• $\rho$: For a given $(x, y) \in \mathbb{F}_5^2$, define $t \in \{0, \ldots, 23\}$ by the equation

\[ \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix}^t \cdot \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix} \pmod{5}, \]

with $t = -1$ if $x = y = 0$. Then for all $x, y, z$ apply the transform, see Figure 14.10,

\[ A[x, y, z] \leftarrow A[x, y, (z - (t+1) \cdot (t+2)/2) \pmod{2^\ell}]. \]

• $\pi$: For a given $(x, y) \in \mathbb{F}_5^2$ define $(x', y')$ by

\[ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix} \cdot \begin{pmatrix} x \\ y \end{pmatrix} \pmod{5}, \]

then for all $x, y, z$ apply the transform, see Figure 14.11,

\[ A[x', y', z] \leftarrow A[x, y, z]. \]

• $\chi$: For all $x, y, z$ apply the transform, see Figure 14.11,

\[ A[x, y, z] \leftarrow A[x, y, z] \oplus (A[x+1, y, z] + 1) \cdot A[x+2, y, z]. \]

Notice that this is a non-linear operation, and is the only one we define for SHA-3.

• $\iota$: For round number $i \in \{0, \ldots, 12 + 2 \cdot \ell - 1\}$ we define a round constant $rc_i$. In round
$i$ the round constant $rc_i$ is added to the $(0, 0)$-lane. The round constants, assuming the
standard of 24 rounds, are

 i  rc_i                 i  rc_i                 i  rc_i                 i  rc_i
 0  0x0000000000000001   1  0x0000000000008082   2  0x800000000000808A   3  0x8000000080008000
 4  0x000000000000808B   5  0x0000000080000001   6  0x8000000080008081   7  0x8000000000008009
 8  0x000000000000008A   9  0x0000000000000088  10  0x0000000080008009  11  0x000000008000000A
12  0x000000008000808B  13  0x800000000000008B  14  0x8000000000008089  15  0x8000000000008003
16  0x8000000000008002  17  0x8000000000000080  18  0x000000000000800A  19  0x800000008000000A
20  0x8000000080008081  21  0x8000000000008080  22  0x0000000080000001  23  0x8000000080008008

These constants are truncated in the case that the lane is not 64 bits long.

The combination of these five transforms forms a round. Each round is repeated a total of $12 + 2 \cdot \ell$
times to define the permutation $p$; for the standard configurations, with $\ell = 6$, this means the
number of rounds is equal to 24. In Algorithm 14.5 we present an overview of the $p$ function in
SHA-3, which is called Keccak-$f$.

The key design principle is that we want to create an avalanche effect, meaning a small change
in the state between two invocations should result in a massive change in the resulting output.
Each of the five basic functions makes a small local change to the state, but combining different
axes of the state. Thus, for example, the $\rho$ transformation works by diffusion between the slices,
much like ShiftRows works in AES. The $\pi$ transform on the other hand works on each slice in
turn, in a method reminiscent of MixColumns from AES; $\iota$ adds in “round constants” in a method
reminiscent of AddRoundKey from AES, and $\chi$ provides non-linearity per round in much the same
way as SubBytes does in AES. Finally, $\theta$ provides a further form of mixing between neighbouring
columns. As all these steps are applied more than $2 \cdot \ell$ times every entry in the state affects every
other entry in a non-linear manner.
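The five transforms fit in a few dozen lines of Python when each lane is held as a 64-bit integer. The sketch below implements Keccak-$f$ for the standard 1600-bit state and wraps it in a sponge with $r = 1088$ to obtain the 256-bit hash. To allow a comparison with `hashlib.sha3_256` it follows the published SHA-3 standard in two details not covered in the text above: the two domain-separation bits 01 appended before the padding, and the little-endian byte order within each lane. The function names are ours.

```python
import hashlib

MASK = (1 << 64) - 1
RC = [0x0000000000000001, 0x0000000000008082, 0x800000000000808A, 0x8000000080008000,
      0x000000000000808B, 0x0000000080000001, 0x8000000080008081, 0x8000000000008009,
      0x000000000000008A, 0x0000000000000088, 0x0000000080008009, 0x000000008000000A,
      0x000000008000808B, 0x800000000000008B, 0x8000000000008089, 0x8000000000008003,
      0x8000000000008002, 0x8000000000000080, 0x000000000000800A, 0x800000008000000A,
      0x8000000080008081, 0x8000000000008080, 0x0000000080000001, 0x8000000080008008]

def rol(x, n):
    """Rotate the 64-bit lane x left by n bits."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK

def keccak_f(A):
    """Apply the 24 rounds to the state; A[x][y] is the lane at (x, y)."""
    for rc in RC:
        # theta: xor each column with the parities of two neighbouring columns
        par = [A[x][0] ^ A[x][1] ^ A[x][2] ^ A[x][3] ^ A[x][4] for x in range(5)]
        D = [par[(x - 1) % 5] ^ rol(par[(x + 1) % 5], 1) for x in range(5)]
        A = [[A[x][y] ^ D[x] for y in range(5)] for x in range(5)]
        # rho and pi combined: rotate each lane, then move it to (y, 2x + 3y)
        Bm = [[0] * 5 for _ in range(5)]
        Bm[0][0] = A[0][0]
        x, y = 1, 0
        for t in range(24):
            Bm[y][(2 * x + 3 * y) % 5] = rol(A[x][y], (t + 1) * (t + 2) // 2)
            x, y = y, (2 * x + 3 * y) % 5
        # chi: the only non-linear step
        A = [[Bm[x][y] ^ (~Bm[(x + 1) % 5][y] & Bm[(x + 2) % 5][y])
              for y in range(5)] for x in range(5)]
        # iota: add the round constant to the (0, 0) lane
        A[0][0] ^= rc
    return A

def sha3_256(msg):
    rate = 136                                   # r = 1088 bits, c = 512 bits
    padded = bytearray(msg) + b"\x06" + b"\x00" * ((-(len(msg) + 1)) % rate)
    padded[-1] |= 0x80                           # final 1 bit of the padding
    A = [[0] * 5 for _ in range(5)]
    for off in range(0, len(padded), rate):      # absorb
        block = padded[off:off + rate]
        for i in range(rate // 8):               # lane i sits at x = i % 5, y = i // 5
            A[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        A = keccak_f(A)
    return b"".join(A[i % 5][i // 5].to_bytes(8, "little") for i in range(4))

assert sha3_256(b"abc") == hashlib.sha3_256(b"abc").digest()
```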


Figure 14.11. The SHA-3 $\pi$ and $\chi$ transforms

Algorithm 14.5: The SHA-3 $p$ function: Keccak-$f$

/* Permutation */
for $i = 0$ to $12 + 2 \cdot \ell - 1$ do
    $A \leftarrow \theta(A)$.
    $A \leftarrow \rho(A)$.
    $A \leftarrow \pi(A)$.
    $A \leftarrow \chi(A)$.
    $A \leftarrow \iota(A, i)$.


14.8.3. Sponges With Everything: We now discuss a number of additional functionalities
which arise from the basic sponge construction, thus showing its utility in designing other primitives.
As one can see from all of these constructions, bar that of an IND-CCA secure encryption scheme,
they are less complicated than their equivalent block-cipher-based or compression-function-based
cousins.

Sponge-Based MAC: Recall that message authentication codes are symmetric keyed primitives
which ensure that a message is authentic, i.e. has not been tampered with and has come from some
other party possessing the same symmetric key. If we take the key $k \in \{0,1\}^r$ and the message
$m \in \{0,1\}^*$, and then apply our sponge construction, we obtain a MAC of any length we want, r
bits at a time. Since the sponge “acts like” a random oracle the resulting code acts like a random
value, and so the only way for an adversary to win the EUF-CMA game is to obtain a collision in
the sponge construction when used in this way. Thus the sponge construction on its own, with a
suitable choice of the permutation p, is a secure message authentication code.
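
As a rough illustration of the sponge-based MAC, the following Python sketch uses SHAKE256 from the standard library (a Keccak-based extendable-output function) as the sponge, absorbing the key followed by the message and squeezing out the tag. The names `sponge_mac` and `verify`, the zero-padding of the key to one 136-byte block (the SHAKE256 rate) and the 32-byte tag length are assumptions made for this example only; standardized constructions add domain separation which is omitted here.

```python
import hashlib
import hmac

RATE_BYTES = 136  # rate of SHAKE256 in bytes (1088 bits)
TAG_BYTES = 32    # assumed tag length for this sketch


def sponge_mac(key: bytes, message: bytes, tag_len: int = TAG_BYTES) -> bytes:
    """Toy sponge MAC: absorb key || message, then squeeze tag_len bytes."""
    if len(key) > RATE_BYTES:
        raise ValueError("key must fit in one rate-sized block")
    # Assumption: the key is zero-padded to fill exactly one block, so keys
    # that differ only in trailing zero bytes are treated as equal.
    key_block = key.ljust(RATE_BYTES, b"\x00")
    return hashlib.shake_256(key_block + message).digest(tag_len)


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sponge_mac(key, message), tag)
```

Because the sponge can be squeezed for as long as we like, asking for a longer tag simply extends the shorter one.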

Sponge-Based KDF/Stream Cipher: A sponge-based hash function can be utilized as a stream
cipher, and hence a KDF, in a relatively simple way. We take the key k and append an IV if needed.
The result is padded via padding method four, and then this is absorbed into the sponge. The
keystream is then squeezed out of the sponge r bits at a time. Again, the fact that the sponge “acts
like” a random oracle ensures that the keystream is indistinguishable from a truly random string,
and with a random IV this gives us an IND-CPA secure stream cipher.
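
The stream cipher construction can be sketched in the same way. Here SHAKE256 again stands in for the sponge, and the padding is whatever `hashlib` applies internally rather than the padding method of the text. The function name `sponge_stream_xor` and the two-byte length prefix on the key are assumptions of this sketch.

```python
import hashlib


def sponge_stream_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data by XORing it with a squeezed keystream."""
    # Assumption: the key is length-prefixed so that different (key, iv)
    # pairs can never be absorbed as the same byte string.
    seed = len(key).to_bytes(2, "big") + key + iv
    keystream = hashlib.shake_256(seed).digest(len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))
```

Since encryption is an XOR with the keystream, applying the function twice with the same key and IV returns the original data; reusing an IV under the same key reuses the keystream and must be avoided.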

Sponge-Based IND-CCA Secure Encryption: An interesting aspect of the sponge construction
for hash functions is that the same construction can be used to produce an IND-CCA secure
symmetric encryption scheme. This construction is a little more involved; it works by combining the
above method for obtaining a stream cipher from a sponge, and the method for obtaining a MAC
from a sponge, in an Encrypt-then-MAC methodology. However, for efficiency, the tapping of the
keystream and the feeding in of the ciphertext to obtain a MAC on the ciphertext happen
simultaneously. To do this, and obtain security, a special form of padding must be used on the message;
details are in the Further Reading section of this chapter. We present the (simplified) encryption
and decryption operations in Figure 14.12.


Figure 14.12. IND-CCA encryption (above) and decryption (below) using a sponge Sp[p]


Chapter  Summary 


• Keyed hash functions have a well-defined security model; a similar definition for unkeyed
hash functions is harder to give. So we have to rely on “human ignorance”.

• Due to the birthday paradox when collision resistance is a requirement, the output of the
hash function should be at least twice the size of what one believes to be the limit of the
computational ability of the attacker.

• Most hash functions are iterative in nature, although most of the currently deployed ones
from the MD4 family have been shown to be weaker than expected.

• The current best hash functions are SHA-2, based on the Merkle-Damgård construction,
and SHA-3, which is a so-called sponge construction.

• A message authentication code is in some sense a keyed hash function, whilst a key
derivation function is a hash function with arbitrary length codomain.

• Message authentication codes and key derivation functions can be created out of either
block ciphers or hash functions.


Further  Reading 

A detailed description of both SHA-1 and the SHA-2 algorithms can be found in the FIPS standard
below; this includes a set of test vectors as well. The proof of security of EMAC is given in the
paper by Petrank and Rackoff, where it is called DMAC. Details of SHA-3 can be found on the
Keccak website given below.

FIPS PUB 180-4, Secure Hash Standard (SHS). NIST, 2012.

G. Bertoni, J. Daemen, M. Peeters and G. Van Assche. The Keccak Sponge Function Family.
http://keccak.noekeon.org/.

E. Petrank and C. Rackoff. CBC MAC for real-time data sources. Journal of Cryptology, 13,
315-338, 2000.


CHAPTER  15 


The  “Naive”  RSA  Algorithm 


Chapter  Goals 


•  To  understand  the  naive  RSA  encryption  algorithm  and  the  assumptions  on  which  its 
security  relies. 

•  To  do  the  same  for  the  naive  RSA  signature  algorithm. 

•  To  show  why  these  naive  versions  cannot  be  considered  secure. 

•  To  explain  Wiener’s  attack  on  RSA  using  continued  fractions. 

• To explain how to use Coppersmith’s Theorem for finding small roots of modular polynomial
equations to extend this attack to other situations.

•  To  introduce  the  notions  of  partial  key  exposure  and  fault  analysis. 

15.1.  “Naive”  RSA  Encryption 

The  RSA  algorithm  was  the  world’s  first  public  key  encryption  algorithm,  and  it  has  stood  the 
test  of  time  remarkably  well.  The  RSA  algorithm  is  based  on  the  difficulty  of  the  RSA  problem 
considered  in  Chapter  2,  and  hence  it  is  based  on  the  difficulty  of  finding  the  prime  factors  of 
large  integers.  However,  we  have  seen  that  it  may  be  possible  to  solve  the  RSA  problem  without 
factoring,  hence  the  RSA  algorithm  is  not  based  completely  on  the  difficulty  of  factoring. 

Suppose  Alice  wishes  to  enable  anyone  to  send  her  secret  messages,  which  only  she  can  decrypt. 
She  first  picks  two  large  secret  prime  numbers  p  and  q.  Alice  then  computes 

$$N = p \cdot q.$$

Alice also chooses an encryption exponent e which satisfies

$$\gcd(e, (p-1) \cdot (q-1)) = 1.$$

It is common to choose e = 3, 17 or 65 537. Now Alice’s public key is the pair pk = (N, e), which she
can publish in a public directory. To compute her private key Alice applies the extended Euclidean
algorithm to e and $(p-1)(q-1)$ to obtain the decryption exponent d, which should satisfy

$$e \cdot d \equiv 1 \pmod{(p-1)(q-1)}.$$

Alice keeps secret her private key, which is the triple sk = (d, p, q). Actually, she could simply
throw away p and q, and retain a copy of her public key which contains the integer N, but as we
saw in Chapter 6 holding onto the prime factors can aid the exponentiation algorithm modulo N.

Now suppose Bob wishes to encrypt a message to Alice. He first looks up Alice’s public key
and represents the message as a number m which is strictly less than the public modulus N. The
ciphertext is then produced by raising the message to the power of the public encryption exponent
modulo the public modulus, i.e.

$$c \leftarrow m^e \pmod{N}.$$
On receiving c Alice can decrypt the ciphertext to recover the message by exponentiating by the
private decryption exponent, i.e.

$$m \leftarrow c^d \pmod{N}.$$

This works since the group $(\mathbb{Z}/N\mathbb{Z})^*$ has order

$$\phi(N) = (p-1) \cdot (q-1)$$

and so, by Lagrange’s Theorem,

$$x^{(p-1) \cdot (q-1)} \equiv 1 \pmod{N},$$

for all $x \in (\mathbb{Z}/N\mathbb{Z})^*$. Thus, for some integer s we have

$$e \cdot d - s \cdot (p-1) \cdot (q-1) = 1,$$

and so

$$c^d = (m^e)^d = m^{e \cdot d} = m^{1 + s \cdot (p-1) \cdot (q-1)} = m \cdot m^{s \cdot (p-1) \cdot (q-1)} = m.$$

To make things clearer let’s consider a baby example. Choose p = 7 and q = 11, and so N = 77
and $(p-1) \cdot (q-1) = 6 \cdot 10 = 60$. We pick as the public encryption exponent e = 37, since we have
gcd(37, 60) = 1. Then, applying the extended Euclidean algorithm we obtain d = 13 since

$$37 \cdot 13 = 481 \equiv 1 \pmod{60}.$$

Suppose the message we wish to transmit is given by m = 2, then to encrypt m we compute

$$c \leftarrow m^e \pmod{N} = 2^{37} \pmod{77} = 51,$$

whilst to decrypt the ciphertext c we compute

$$m \leftarrow c^d \pmod{N} = 51^{13} \pmod{77} = 2.$$
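
The baby example can be reproduced with a few lines of Python. This is only a sketch of the naive scheme for toy parameters: the function names are ours, the primes are supplied by the caller rather than generated, and Python's three-argument `pow` (with exponent -1, available from Python 3.8) plays the role of the extended Euclidean algorithm.

```python
from math import gcd


def rsa_keygen(p: int, q: int, e: int):
    """Return the public key (N, e) and the private exponent d."""
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p-1)(q-1)")
    d = pow(e, -1, phi)  # inverse of e modulo (p-1)(q-1)
    return (p * q, e), d


def rsa_encrypt(pk, m: int) -> int:
    """Naive RSA encryption: c = m^e mod N."""
    N, e = pk
    return pow(m, e, N)


def rsa_decrypt(pk, d: int, c: int) -> int:
    """Naive RSA decryption: m = c^d mod N."""
    N, _ = pk
    return pow(c, d, N)
```

With p = 7, q = 11 and e = 37 this gives d = 13, and the message m = 2 encrypts to c = 51, matching the values above.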


The security of RSA on first inspection relies on the difficulty of finding the private decryption
exponent d given only the public key, namely the public modulus N and the public encryption
exponent e. In Chapter 2 we showed that the RSA problem is no harder than FACTOR, hence if
we can factor N then we can find p and q and hence we can calculate d. Hence, if factoring is easy
we can break RSA. Currently it is recommended that one takes public moduli of size around 2048
bits to ensure medium-term security.

Recall,  from  Chapter  11,  that  for  a  public  key  algorithm  the  adversary  always  has  access  to  the 
encryption  algorithm,  hence  she  can  always  mount  a  chosen  plaintext  attack.  We  can  show  that 
RSA  encryption  meets  our  weakest  notion  of  security  for  public  key  encryption,  namely  OW-CPA, 
assuming  that  the  RSA  problem  is  hard.  To  show  this  we  use  the  reduction  arguments  of  previous 
chapters.  This  example  is  rather  trivial  but  we  labour  the  point  since  these  arguments  are  used 
over  and  over  again. 

Lemma 15.1. If the RSA problem is hard then the naive RSA encryption scheme is OW-CPA
secure. In particular if A is an adversary which breaks the OW-CPA security of naive RSA encryption
$\Pi$ for RSA moduli of v bits in length, then there is an adversary B such that

$$\mathrm{Adv}^{\mathsf{OW\text{-}CPA}}_{\Pi}(A) = \mathrm{Adv}^{\mathsf{RSA}}_{v}(B).$$

Proof.  We  wish  to  give  an  algorithm  which  solves  the  RSA  problem  using  an  algorithm  to  break 
the  RSA  cryptosystem  as  an  oracle.  If  we  can  show  this  then  we  can  conclude  that  breaking  the 
RSA  cryptosystem  is  no  harder  than  solving  the  RSA  problem. 

Recall that the RSA problem is given $N = p \cdot q$, e and $y \in (\mathbb{Z}/N\mathbb{Z})^*$, compute an x such that
$x^e \pmod{N} = y$. We use our oracle to break the RSA encryption algorithm to “decrypt” the message
corresponding to c = y; this oracle will return the plaintext message m. Then our RSA problem is
solved by setting $x \leftarrow m$ since, by definition,

$$m^e \pmod{N} = c = y.$$

So  if  we  can  break  the  RSA  algorithm  then  we  can  solve  the  RSA  problem.  □ 

This  is,  however,  the  best  we  can  hope  for  since  naive  RSA  encryption  is  deterministic,  and  thus 
we  have  the  following. 

Lemma  15.2.  Naive  RSA  encryption  is  not  IND-CPA  secure. 

Proof. Recall from Chapter 11 that in the IND-CPA game the adversary produces two plaintexts,
which we shall denote by $m_0$ and $m_1$. The challenger then encrypts one of them to obtain the
challenge ciphertext $c^* \leftarrow m_b^e \pmod{N}$, for some hidden bit $b \in \{0,1\}$. The adversary’s goal is
then to determine b. This task can be easily accomplished since all the adversary needs to do is to
compute $c \leftarrow m_1^e \pmod{N}$, and then

• if $c^* = c$ then the attacker knows that $m_b = m_1$,

• if $c^* \neq c$ then the attacker knows that $m_b = m_0$.

□ 
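
The attack in this proof is short enough to simulate directly. In the sketch below `challenger` plays the role of the IND-CPA challenger and `adversary_guess` is the adversary; both names, and the use of the toy modulus N = 77 in the usage note, are illustrative choices of ours.

```python
import secrets


def challenger(pk, m0: int, m1: int):
    """Pick a hidden bit b and return (b, encryption of m_b)."""
    N, e = pk
    b = secrets.randbelow(2)
    return b, pow((m0, m1)[b], e, N)


def adversary_guess(pk, m0: int, m1: int, c_star: int) -> int:
    """Re-encrypt m1 and compare it with the challenge ciphertext."""
    N, e = pk
    return 1 if pow(m1, e, N) == c_star else 0
```

For the public key (77, 37) and any two distinct messages below 77 the guess always equals the hidden bit, since naive RSA encryption is a deterministic permutation of the integers modulo N.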


The  problem  is  that  the  attacker  has  access  to  the  encryption  function;  it  is  a  public  key  scheme 
after  all.  But  using  a  deterministic  encryption  function  is  not  the  only  problem  with  RSA,  since 
RSA  is  also  malleable  due  to  the  homomorphic  property. 

Definition 15.3 (Homomorphic Property). An encryption scheme has the (multiplicative)
homomorphic property if given the encryptions of $m_1$ and $m_2$ we can determine the encryption of $m_1 \cdot m_2$,
without knowing $m_1$ or $m_2$.

A similar definition can be given for additive homomorphisms as well. That RSA has the
homomorphic property follows from the equation

$$(m_1 \cdot m_2)^e \pmod{N} = \left( (m_1^e \pmod{N}) \cdot (m_2^e \pmod{N}) \right) \pmod{N}.$$

One  can  use  the  homomorphic  property  to  show  that  RSA  is  not  even  one-way  secure  under  an 
adaptive  chosen  ciphertext  attack. 

Lemma  15.4.  Naive  RSA  encryption  is  not  OW-CCA  secure. 


Proof.  Recall  from  Chapter  11  that  the  OW-CCA  game  is  like  the  OW-CPA  game,  except  that  the 
adversary  now  has  access  to  a  decryption  oracle  which  can  decrypt  any  ciphertext,  bar  the  target 
ciphertext  c*.  Suppose  the  challenger  gives  the  adversary  the  challenge  ciphertext 

$$c^* = (m^*)^e \pmod{N}.$$

The goal of the adversary is to find $m^*$. The adversary then creates the “related” ciphertext
$c = 2^e \cdot c^*$ and asks her decryption oracle to decrypt c to produce m. Notice that this is a legal
query under the rules of the game as $c \neq c^*$. The adversary can then compute

$$\frac{m}{2} = \frac{c^d}{2} = \frac{(2^e \cdot c^*)^d}{2} = \frac{2^{e \cdot d} \cdot (c^*)^d}{2} = \frac{2 \cdot m^*}{2} = m^*.$$


□ 
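
The chosen ciphertext attack of this proof can be sketched as follows. The decryption oracle is modelled by a closure which refuses only the target ciphertext; the names `make_decryption_oracle` and `ow_cca_attack` are ours, and division by two is carried out by multiplying by the inverse of 2 modulo N (which exists since N is odd).

```python
def make_decryption_oracle(N: int, d: int, c_star: int):
    """Return an oracle decrypting anything except the target ciphertext."""
    def oracle(c: int) -> int:
        if c % N == c_star % N:
            raise ValueError("the target ciphertext may not be queried")
        return pow(c, d, N)
    return oracle


def ow_cca_attack(pk, c_star: int, oracle) -> int:
    """Recover the target plaintext using one legal oracle query."""
    N, e = pk
    c = (pow(2, e, N) * c_star) % N   # encrypts 2 * m_star
    m = oracle(c)                     # oracle returns 2 * m_star mod N
    return (m * pow(2, -1, N)) % N    # divide by two modulo N
```

With the baby parameters N = 77, e = 37, d = 13 and target message 5, the attack returns 5 without ever querying the challenge itself.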


In  Chapter  16  we  will  present  variants  of  RSA  encryption,  as  well  as  other  public  key  encryption 
schemes,  and  show  that  they  are  secure  in  the  sense  of  IND-CCA.  It  is  these  more  advanced 
algorithms  which  one  should  use  in  practice,  and  this  is  why  we  have  dubbed  the  above  method  of 
encrypting  “naive”  RSA. 
15.1.1. Rabin Encryption: The obvious question is whether we can construct an OW-CPA secure
scheme, which is efficient, and which is based on the factoring problem itself. The Rabin encryption
scheme is one such scheme; it replaces the RSA map (which is a permutation on the group $(\mathbb{Z}/N\mathbb{Z})^*$)
with a map which is not injective. The Rabin scheme, due to Michael Rabin, bases its security on
the difficulty of extracting square roots modulo $N = p \cdot q$, for two large unknown primes p and q.
Recall Theorem 2.4, which showed that this SQRROOT problem is polynomial-time equivalent to
the factoring problem.

Despite  these  plus  points  the  Rabin  system  is  not  used  as  much  as  the  RSA  system.  It  is, 
however,  useful  to  study  for  a  number  of  reasons,  both  historical  and  theoretical.  The  basic  idea 
of the system is also used in some higher-level protocols.

Key Generation: We first choose prime numbers of the form $p \equiv q \equiv 3 \pmod{4}$, since this makes
extracting square roots modulo p and q very fast. The private key is then the pair $sk \leftarrow (p, q)$, and
the public key is $pk \leftarrow N = p \cdot q$.

Encryption: To encrypt a message m using the above public key, in the Rabin encryption
algorithm we compute $c \leftarrow m^2 \pmod{N}$. Hence, encryption involves one multiplication modulo N,
and is therefore much faster than RSA encryption, even when one chooses a small RSA encryption
exponent. Indeed, encryption in the Rabin encryption system is much faster than almost any other
public key scheme.

Decryption: Decryption is far more complicated; essentially we want to compute the value of
$m = \sqrt{c} \pmod{N}$. At first sight this uses no private information, but a moment’s thought reveals
that one needs the factorization of N to be able to find the square root. In particular to compute
m one computes

$$m_p = \sqrt{c} \pmod{p} = c^{(p+1)/4} \pmod{p},$$
$$m_q = \sqrt{c} \pmod{q} = c^{(q+1)/4} \pmod{q},$$

and then combines $m_p$ and $m_q$ by the Chinese Remainder Theorem. There are however four possible
square roots modulo N, since N is the product of two primes. Hence, on decryption we obtain four
possible plaintexts. This means that we need to add redundancy to the plaintext before encryption
in order to decide which of the four possible plaintexts corresponds to the intended one.

Example: Let the private key be given by p = 127 and q = 131, and the public key be given by
N = 16 637. To encrypt the message m = 4410 we compute

$$c = m^2 \pmod{N} = 16\,084.$$

To decrypt we evaluate the square root of c modulo p and q

$$m_p = \sqrt{c} \pmod{p} = \pm 35,$$
$$m_q = \sqrt{c} \pmod{q} = \pm 44.$$

Now we apply the Chinese Remainder Theorem to both $\pm 35 \pmod{p}$ and $\pm 44 \pmod{q}$ so as to
find the square root of c modulo N,

$$s = \sqrt{c} \pmod{N} = \pm 4410 \text{ and } \pm 1616.$$

This  leaves  us  with  the  four  “messages” 

1616,  4410,  12227,  or  15021. 
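
The Rabin example can be checked with the following sketch, which computes the square roots modulo p and q by the exponentiation given above and then combines all sign choices by the Chinese Remainder Theorem. The function names are ours, and the code assumes that c really is a square modulo N and that both primes are congruent to 3 modulo 4.

```python
def rabin_encrypt(N: int, m: int) -> int:
    """Rabin encryption: a single squaring modulo N."""
    return pow(m, 2, N)


def rabin_decrypt(p: int, q: int, c: int):
    """Return the four square roots of c modulo N = p*q, in sorted order."""
    if p % 4 != 3 or q % 4 != 3:
        raise ValueError("both primes must be congruent to 3 modulo 4")
    N = p * q
    mp = pow(c, (p + 1) // 4, p)   # a square root of c modulo p
    mq = pow(c, (q + 1) // 4, q)   # a square root of c modulo q
    a = q * pow(q, -1, p)          # CRT coefficient: 1 mod p, 0 mod q
    b = p * pow(p, -1, q)          # CRT coefficient: 0 mod p, 1 mod q
    roots = {(sp * mp * a + sq * mq * b) % N
             for sp in (1, -1) for sq in (1, -1)}
    return sorted(roots)
```

For p = 127, q = 131 and c = 16 084 this returns the four candidates 1616, 4410, 12227 and 15021 listed above; the receiver would use the redundancy in the plaintext to pick the intended one.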


It  should  be  pretty  clear  that  any  adversary  breaking  the  OW-CPA  security  of  the  Rabin  encryption 
scheme  can  immediately  be  turned  into  an  adversary  to  break  the  SQRROOT  problem,  and  hence 
into  an  adversary  to  factor  integers.  So  we  have  the  following. 
Theorem 15.5. If A is an adversary against the OW-CPA security of the Rabin encryption scheme
$\Pi$ for moduli of size v bits, then there is an adversary B against the factoring problem with

$$\mathrm{Adv}^{\mathsf{OW\text{-}CPA}}_{\Pi}(A) = 2 \cdot \mathrm{Adv}^{\mathsf{FACTOR}}_{v}(B),$$

the factor of two coming from Lemma 2.6.

It  should  also  be  pretty  obvious  that  the  scheme  is  not  OW-CCA  secure,  since  the  scheme  is  obviously 
malleable  in  the  same  way  that  RSA  was.  In  addition  it  is  clearly  not  IND-CPA  as  it  is  deterministic, 
like  RSA. 


15.2.  “Naive”  RSA  Signatures 


The  RSA  encryption  algorithm  is  particularly  interesting  since  it  can  be  used  directly  as  a  so-called 
signature  algorithm  with  message  recovery. 

• The sender applies the RSA decryption transform to generate the signature, by taking the
message and raising it to the private exponent d

$$s \leftarrow m^d \pmod{N}.$$

• The receiver then applies the RSA encryption transform to recover the original message

$$m \leftarrow s^e \pmod{N}.$$


But this raises the question: how do we check the validity of the signature? If the original message
is  in  a  natural  language  such  as  English  then  one  can  verify  that  the  extracted  message  is  also  in 
the  same  natural  language.  But  this  is  not  a  solution  for  all  possible  messages.  Hence  one  needs 
to  add  redundancy  to  the  message. 

One, almost prehistoric, way of doing this in the early days of public key cryptography was
the following. Suppose the message m is t bits long and the RSA modulus N is k bits long, with
$t < k - 32$. We first pad m to the right with zeros to produce a string of length a multiple of eight.
We then add $(k - t)/8$ bytes to the left of m to produce a byte-string

$$m \leftarrow \texttt{00 01 FF FF} \ldots \texttt{FF 00} \,\|\, m.$$

The signature is then computed via

$$m^d \pmod{N}.$$

When verifying the signature we ensure that the recovered value of m has the correct padding. This
form of padding also seems to prevent the following trivial existential forgery attack. The attacker
picks s at random and then computes

$$m \leftarrow s^e \pmod{N}.$$

The attacker then has the signature s on the message m.
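
The existential forgery on unpadded signatures needs nothing but the public key, as the following sketch shows. The function names are ours; the forger has no control over the message m which results.

```python
import secrets


def rsa_recover(pk, s: int) -> int:
    """Message recovery: apply the public RSA map to a signature."""
    N, e = pk
    return pow(s, e, N)


def existential_forgery(pk):
    """Return a pair (m, s) such that s is a valid signature on m."""
    N, _ = pk
    s = secrets.randbelow(N - 2) + 2   # random s with 2 <= s <= N - 1
    return rsa_recover(pk, s), s
```

For the toy key N = 77, e = 37 (with d = 13) one can confirm that the genuine signing operation applied to the returned m gives exactly the forged s.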

Moreover, the padding scheme also seems to prevent selective forgeries, which are in some sense
an even worse form of weakness. Without the padding scheme we can produce a selective forgery,
using access to a signing oracle, as follows. Suppose the attacker wishes to produce a signature s
on the message m. She first generates a random $m_1 \in (\mathbb{Z}/N\mathbb{Z})^*$ and computes

$$m_2 \leftarrow \frac{m}{m_1}.$$

Then the attacker asks her oracle to sign the messages $m_1$ and $m_2$. This results in two signatures
$s_1$ and $s_2$ such that

$$s_i = m_i^d \pmod{N}.$$

The attacker can then compute the signature on the message m by computing

$$s \leftarrow s_1 \cdot s_2 \pmod{N}$$

since

$$s = s_1 \cdot s_2 = m_1^d \cdot m_2^d = (m_1 \cdot m_2)^d = m^d \pmod{N}.$$

Here  we  have  used  once  more  the  homomorphic  property  of  the  RSA  function,  just  as  we  did  when 
we  showed  that  RSA  encryption  was  not  OW-CCA  secure. 
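
A sketch of this selective forgery is given below. The signing oracle is passed in as a function, and the attacker never submits the target message itself; the name `selective_forgery` and the rejection loop used to sample an invertible blinding value are choices made for this illustration.

```python
import secrets
from math import gcd


def selective_forgery(pk, m: int, sign_oracle) -> int:
    """Forge a signature on m from signatures on two other messages."""
    N, _ = pk
    while True:
        m1 = secrets.randbelow(N - 2) + 2        # 2 <= m1 <= N - 1
        if gcd(m1, N) == 1 and m1 != m % N:      # invertible and not m
            break
    m2 = (m * pow(m1, -1, N)) % N                # m2 = m / m1 mod N
    s1 = sign_oracle(m1)
    s2 = sign_oracle(m2)
    return (s1 * s2) % N
```

Since $m_1 \neq 1$ we also have $m_2 \neq m$, so both oracle queries are on messages different from the target.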

But  not  all  messages  will  be  so  short  so  as  to  fit  into  the  above  method.  Hence,  naively  to 
apply  the  RSA  signature  algorithm  to  a  long  message  m  we  need  to  break  it  into  blocks  and  sign 
each  block  in  turn.  This  is  very  time-consuming  for  long  messages.  Worse  than  this,  we  must  add 
serial  numbers  and  more  redundancy  to  each  message  otherwise  an  attacker  could  delete  parts  of 
the  long  message  without  us  knowing,  just  as  could  happen  when  encrypting  using  a  block  cipher 
in  ECB  Mode.  This  problem  arises  because  our  signature  model  is  one  giving  message  recovery, 
i.e.  the  message  is  recovered  from  the  signature  and  the  verification  process.  If  we  used  a  model 
called  a  signature  with  appendix  then  we  could  first  produce  a  hash  of  the  message  to  be  signed 
and  then  just  sign  the  hash. 

Using a cryptographic hash function H, such as those described in Chapter 14, it is possible
to make RSA into a signature scheme without message recovery, which is very efficient for long
messages. Suppose we are given a long message m for signing; we first compute H(m) and then
apply the RSA signing transform to H(m), i.e. the signature is given by

$$s \leftarrow H(m)^d \pmod{N}.$$

The signature and message are then transmitted together as the pair (m, s). Verifying a
message/signature pair (m, s) generated using a hash function involves three steps.

• “Encrypt” s using the RSA encryption function to recover h, i.e.

$$h \leftarrow s^e \pmod{N}.$$

• Compute H(m) from m.

• Check whether h = H(m). If they agree accept the signature as valid, otherwise the
signature should be rejected.
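
The hash-then-sign procedure and its three verification steps can be sketched as follows. As a stand-in for H we reduce a SHA-256 digest modulo N; this `toy_hash` is purely an assumption of the example and is not a proper full domain hash, and the function names are ours.

```python
import hashlib


def toy_hash(message: bytes, N: int) -> int:
    """Assumed stand-in for H: SHA-256 digest reduced modulo N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes, d: int, N: int) -> int:
    """Signature with appendix: s = H(m)^d mod N."""
    return pow(toy_hash(message, N), d, N)


def verify(message: bytes, s: int, pk) -> bool:
    """Recover h = s^e mod N, recompute H(m), and compare."""
    N, e = pk
    h = pow(s, e, N)
    return h == toy_hash(message, N)
```

The sketch works with any toy key, for instance N = 1 441 499, e = 17 and d = 507 905; a signature altered in any way no longer verifies because the RSA map is a permutation.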

Since a hash function does not usually have codomain the whole of the integers modulo N, in
practice  one  needs  to  first  hash  and  then  pad  the  message.  We  could,  for  example,  use  the  padding 
scheme  given  earlier  when  we  discussed  RSA  with  message  recovery.  If  we  assume  the  hash  function 
produces  a  value  in  the  range  [0, . . . ,  N  —  1]  then  the  above  scheme  is  called  RSA-FDH,  for  RSA 
Full  Domain  Hash.  The  name  is  because  the  codomain  of  the  hash  function  is  the  entire  domain 
of  the  RSA  function. 

Recall  that  when  we  discussed  cryptographic  hash  functions  we  said  that  they  should  satisfy 
the  following  three  properties: 

(1)  Preimage  Resistant:  It  should  be  hard  to  find  a  message  with  a  given  hash  value. 

(2)  Collision  Resistant:  It  should  be  hard  to  find  two  messages  with  the  same  hash  value. 

(3)  Second  Preimage  Resistant:  Given  one  message  it  should  be  hard  to  find  another 
message  with  the  same  hash  value. 

It  turns  out  that  all  three  properties  are  needed  when  using  the  RSA-FDH  signing  algorithm,  as 
we  shall  now  show. 
Requirement for Preimage Resistance: The one-way property stops a cryptanalyst from cooking
up a message with a given signature. For example, suppose we are using RSA-FDH but with
a hash function which does not have the one-way property. We then have the following attack.

• The adversary computes

$$h \leftarrow r^e \pmod{N}$$

for some random integer r.

• The adversary also computes the preimage of h under H (recall we are assuming that H
does not have the one-way property), i.e. Eve computes

$$m \leftarrow H^{-1}(h).$$

The adversary now has your signature (m, r) on the message m. Recall that such a forgery is called
an existential forgery, since the attacker may not have any control over the contents of the message
on which she has obtained a digital signature.

Requirement  for  Collision  Resistance:  This  is  needed  to  avoid  the  following  attack,  which  is 
performed  by  the  legitimate  signer. 

• The signer chooses two messages m and m' with H(m) = H(m').

• They sign m and output the signature (m, s).

• Later they repudiate this signature, saying it was really a signature on the message m'.

As  a  concrete  example  one  could  have  that  m  is  an  electronic  cheque  for  1  000  euros  whilst  m'  is 
an  electronic  cheque  for  10  euros. 

Requirement  for  Second  Preimage  Resistance:  This  property  is  needed  to  stop  the  following 
attack. 

•  An  attacker  obtains  your  signature  (m,  s)  on  a  message  m. 

• The attacker finds another message m' with H(m') = H(m).

• The attacker now has your signature (m', s) on the message m'.

Thus,  the  security  of  any  signature  scheme  which  uses  a  cryptographic  hash  function  will  depend 
both  on  the  security  of  the  underlying  hard  mathematical  problem,  such  as  factoring  or  the  discrete 
logarithm  problem,  and  the  security  of  the  underlying  hash  function.  In  Chapter  16  we  will  present 
some  variants  of  the  RSA  signature  algorithm,  as  well  as  others,  and  show  that  they  are  secure  in 
the  sense  of  EUF-CMA. 


15.3.  The  Security  of  RSA 

In  this  section  we  examine  in  more  detail  the  security  of  the  RSA  function,  and  the  resulting 
“naive”  encryption  and  signature  algorithms.  In  particular  we  show  how  knowing  the  private  key 
is  equivalent  to  factoring  (which  should  be  contrasted  with  being  able  to  invert  the  function,  namely 
the RSA problem), how knowledge of $\phi(N)$ is also equivalent to factoring, how sharing a modulus
can  be  a  bad  idea,  and  how  also  having  a  small  public  exponent  could  introduce  problems  as  well. 

15.3.1.  Knowledge  of  the  Private  Exponent  and  Factoring:  Whilst  it  is  unclear  whether 
breaking  RSA,  in  the  sense  of  inverting  the  RSA  function,  is  equivalent  to  factoring,  determining 
the  private  key  d  given  the  public  information,  N  and  e,  is  equivalent  to  factoring.  The  algorithm 
in  the  next  proof  is  an  example  of  a  Las  Vegas  algorithm:  It  is  probabilistic  in  nature  in  the  sense 
that  whilst  it  may  not  actually  give  an  answer  (or  terminate),  it  is  however  guaranteed  that  when 
it  does  give  an  answer  then  that  answer  will  always  be  correct. 

Lemma 15.6. If one knows the RSA decryption exponent d corresponding to the public key (N, e)
then one can efficiently factor N.
Proof. Recall that for some integer s

$$e \cdot d - 1 = s \cdot (p-1) \cdot (q-1).$$

We first pick an integer $x \neq 0$; this is guaranteed to satisfy

$$x^{e \cdot d - 1} \equiv 1 \pmod{N}.$$

We now compute a square root $y_1$ of one modulo N,

$$y_1 \leftarrow \sqrt{x^{e \cdot d - 1}} = x^{(e \cdot d - 1)/2},$$

which we can do since $e \cdot d - 1$ is known and will be even. We will then have the identity

$$y_1^2 - 1 \equiv 0 \pmod{N},$$

which we can use to recover a factor of N via computing

$$\gcd(y_1 - 1, N).$$

But this will only work when $y_1 \not\equiv \pm 1 \pmod{N}$.

Now suppose we are unlucky and we obtain $y_1 \equiv \pm 1 \pmod{N}$ rather than a factor of N. If
$y_1 \equiv -1 \pmod{N}$, then we set $y_1 \leftarrow -y_1$. Thus we are always left with the case $y_1 \equiv 1 \pmod{N}$.
We take another square root of one via

$$y_2 \leftarrow \sqrt{y_1} = x^{(e \cdot d - 1)/4}.$$

Again we have

$$y_2^2 - 1 = y_1 - 1 \equiv 0 \pmod{N}.$$

Hence we compute

$$\gcd(y_2 - 1, N)$$

and see whether this gives a factor of N. Again this will give a factor of N unless $y_2 \equiv \pm 1$. If we
are unlucky we repeat once more and so on.

This method can be repeated until either we have factored N or until $(e \cdot d - 1)/2^t$ is no longer
divisible by 2. In this latter case we return to the beginning, choose a new random value of x and
start again. □

We shall now present a small example of the previous method. Consider the following RSA
parameters

$$N = 1\,441\,499, \quad e = 17 \quad \text{and} \quad d = 507\,905.$$

Recall that we are assuming that the private exponent d is public knowledge. We will show that
the previous method does in fact find a factor of N. Put

$$t_1 \leftarrow (e \cdot d - 1)/2 = 4\,317\,192,$$
$$x \leftarrow 2.$$

To compute $y_1$ we evaluate

$$y_1 \leftarrow x^{(e \cdot d - 1)/2} = 2^{t_1} = 1 \pmod{N}.$$

Since we obtain $y_1 = 1$ we need to set

$$t_2 \leftarrow t_1/2 = (e \cdot d - 1)/4 = 2\,158\,596.$$

We now compute $y_2$,

$$y_2 = 2^{t_2} = 1 \pmod{N}.$$

So we need to repeat the method again; this time we obtain $t_3 = (e \cdot d - 1)/8 = 1\,079\,298$. We
compute $y_3$,

$$y_3 = x^{(e \cdot d - 1)/8} = 2^{t_3} = 119\,533 \pmod{N}.$$

So

$$y_3^2 - 1 = (y_3 - 1) \cdot (y_3 + 1) \equiv 0 \pmod{N},$$

and we compute a prime factor of N by evaluating $\gcd(y_3 - 1, N) = 1423$.
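
The method of Lemma 15.6 can be sketched in Python as follows. This is one possible variant rather than a transcription of the proof: when a square root equal to −1 is met, the sketch simply restarts with a fresh random x, and it also checks whether x happens to share a factor with N. The name `factor_from_d` and the bound `max_tries` are ours.

```python
import random
from math import gcd


def factor_from_d(N: int, e: int, d: int, max_tries: int = 100):
    """Las Vegas factoring of N given a valid exponent pair (e, d)."""
    k = e * d - 1                      # a multiple of (p-1)(q-1)
    for _ in range(max_tries):
        x = random.randrange(2, N - 1)
        g = gcd(x, N)
        if g != 1:                     # lucky: x already shares a factor
            return g, N // g
        t = k
        while t % 2 == 0:
            t //= 2
            y = pow(x, t, N)           # y*y is 1 modulo N at this point
            if y == 1:
                continue               # take a further square root
            if y != N - 1:             # non-trivial square root of one
                g = gcd(y - 1, N)
                return g, N // g
            break                      # y = -1: try another x
    raise RuntimeError("no factor found; try again")
```

For the parameters of the example, N = 1 441 499, e = 17 and d = 507 905, the function returns the two primes 1013 and 1423 in some order.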


15.3.2. Knowledge of $\phi(N)$ and Factoring: We have seen that knowledge of d allows us to
factor N. Now we will show that knowledge of $\Phi = \phi(N)$ also allows us to factor N.

Lemma 15.7. Given an RSA modulus N and the value of $\Phi = \phi(N)$ one can efficiently factor N.

Proof. We have

$$\Phi = (p-1) \cdot (q-1) = N - (p+q) + 1.$$

Hence, if we set $S = N + 1 - \Phi$, we obtain

$$S = p + q.$$

So we need to determine p and q from their sum S and product N. Define the polynomial

$$f(X) = (X - p) \cdot (X - q) = X^2 - S \cdot X + N.$$

So we can find p and q by solving f(X) = 0 using the standard formulae for extracting the roots
of a quadratic polynomial,

$$p = \frac{S + \sqrt{S^2 - 4 \cdot N}}{2},$$
$$q = \frac{S - \sqrt{S^2 - 4 \cdot N}}{2}.$$

□ 


As an example consider the RSA public modulus N = 18 923. Assume that we are given $\Phi =
\phi(N) = 18\,648$. We then compute

$$S = p + q = N + 1 - \Phi = 276.$$

Using this we compute the polynomial

$$f(X) = X^2 - S \cdot X + N = X^2 - 276 \cdot X + 18\,923$$

and find that its roots over the real numbers are p = 149, q = 127, which are indeed the factors of
N.
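
Since the discriminant $S^2 - 4 \cdot N = (p - q)^2$ is a perfect square, the roots can be found with exact integer arithmetic. The following sketch does this using `math.isqrt`; the function name `factor_from_phi` is ours.

```python
from math import isqrt


def factor_from_phi(N: int, phi: int):
    """Recover (p, q) with p >= q from N = p*q and phi = (p-1)(q-1)."""
    S = N + 1 - phi                    # S = p + q
    disc = S * S - 4 * N               # equals (p - q)^2 for valid input
    if disc < 0:
        raise ValueError("inconsistent values of N and phi")
    r = isqrt(disc)
    if r * r != disc or (S + r) % 2 != 0:
        raise ValueError("inconsistent values of N and phi")
    return (S + r) // 2, (S - r) // 2
```

For N = 18 923 and $\Phi$ = 18 648 we get S = 276, a discriminant of 484 with square root 22, and hence the factors 149 and 127.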

15.3.3.  Use  of  a  Shared  Modulus:  Since  modular  arithmetic  is  very  expensive  it  can  be  very 
tempting  to  set  up  a  system  in  which  a  number  of  users  share  the  same  public  modulus  N  but 
use different public/private exponents, $(e_i, d_i)$. One reason to do this could be to allow very fast
hardware acceleration of modular arithmetic, specially tuned to the chosen shared modulus N.
This is, however, a very silly idea since it can be attacked in one of two ways, either by a malicious
insider or by an external attacker.

Suppose the attacker is one of the internal users, say user number one. She can now compute
the value of the decryption exponent for user number two, namely $d_2$. First user one computes p
and q since she knows $d_1$, via the algorithm in the proof of Lemma 15.6. Then user one computes
$\phi(N) = (p-1) \cdot (q-1)$, and finally she can recover $d_2$ from

$$d_2 = \frac{1}{e_2} \pmod{\phi(N)}.$$

Now suppose the attacker is not one of the people who share the modulus, and that the two
public exponents $e_1$ and $e_2$ are coprime. We now present an attack against the "naive" RSA
encryption algorithm in this setting. Suppose Alice sends the same message m to two of the users
with public keys $(N, e_1)$ and $(N, e_2)$, i.e. $N_1 = N_2 = N$. Eve, the external attacker, sees the
messages $c_1$ and $c_2$, where the ciphertexts are derived by executing

\[ c_1 \leftarrow m^{e_1} \pmod{N}, \]
\[ c_2 \leftarrow m^{e_2} \pmod{N}. \]

Eve can now compute

\[ t_1 \leftarrow e_1^{-1} \pmod{e_2}, \]
\[ t_2 \leftarrow (t_1 \cdot e_1 - 1)/e_2, \]

and can recover the message m from

\[ c_1^{t_1} \cdot c_2^{-t_2} = m^{e_1 \cdot t_1} \cdot m^{-e_2 \cdot t_2} \]
\[ = m^{1 + e_2 \cdot t_2} \cdot m^{-e_2 \cdot t_2} \]
\[ = m^{1 + e_2 \cdot t_2 - e_2 \cdot t_2} \]
\[ = m^1 = m. \]

As an example of this external attack, take the public keys to be

\[ N = N_1 = N_2 = 18\,923, \quad e_1 = 11 \quad \text{and} \quad e_2 = 5. \]

Now suppose Eve sees the ciphertexts

\[ c_1 = 1514 \quad \text{and} \quad c_2 = 8189 \]

corresponding to the same plaintext m. Then Eve computes $t_1 = 1$ and $t_2 = 2$, and recovers the
message

\[ m = c_1^{t_1} \cdot c_2^{-t_2} = 100 \pmod{N}. \]
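
The external attack can be sketched in Python as follows. This is an illustrative sketch only: the function name `common_modulus_attack` is ours, and it assumes Python 3.8 or later, where the built-in `pow` accepts a negative exponent together with a modulus (which requires $c_2$ to be invertible modulo N).

```python
def common_modulus_attack(N, e1, e2, c1, c2):
    """Recover m from c1 = m^e1 and c2 = m^e2 (mod N), with gcd(e1, e2) = 1."""
    t1 = pow(e1, -1, e2)         # t1 = e1^{-1} (mod e2)
    t2 = (t1 * e1 - 1) // e2     # exact division, by the choice of t1
    # c1^t1 * c2^(-t2) = m^(e1*t1 - e2*t2) = m^1 (mod N)
    return pow(c1, t1, N) * pow(c2, -t2, N) % N

print(common_modulus_attack(18923, 11, 5, 1514, 8189))   # 100
```

Note that Eve never learns the factors of N here; she only combines the two public ciphertexts.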

15.3.4. Use of a Small Public Exponent: In practice RSA systems often use a small public
exponent e so as to cut down the computational cost of the sender. We shall now show that this
can also lead to problems for the RSA encryption algorithm. Suppose we have three users all with
different public moduli $N_1$, $N_2$ and $N_3$. In addition suppose they all have the same small public
exponent e = 3. Suppose someone sends them the same message m. The attacker Eve sees the
messages

\[ c_1 \leftarrow m^3 \pmod{N_1}, \]
\[ c_2 \leftarrow m^3 \pmod{N_2}, \]
\[ c_3 \leftarrow m^3 \pmod{N_3}. \]

Now the attacker, using the Chinese Remainder Theorem, computes the simultaneous solution to
the equations

\[ X = c_i \pmod{N_i} \quad \text{for } i = 1, 2, 3, \]

to obtain

\[ X = m^3 \pmod{N_1 \cdot N_2 \cdot N_3}. \]

But since $m^3 < N_1 \cdot N_2 \cdot N_3$ we must have $X = m^3$ identically over the integers. Hence we can
recover m by taking the real cube root of X.

As a simple example of this attack, take $N_1 = 323$, $N_2 = 299$ and $N_3 = 341$. Suppose Eve sees
the ciphertexts

\[ c_1 = 50, \quad c_2 = 268 \quad \text{and} \quad c_3 = 1, \]

and wants to determine the common value of m. Eve computes via the Chinese Remainder Theorem

\[ X = 300\,763 \pmod{N_1 \cdot N_2 \cdot N_3}. \]

Finally, she computes over the integers $m = X^{1/3} = 67$.
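
The two steps of this attack, Chinese remaindering followed by an integer cube root, can be sketched as below. The helper names `crt`, `integer_cube_root` and `broadcast_attack` are our own, and the sketch assumes pairwise coprime moduli and Python 3.8 or later (for `math.prod` and modular inverses via `pow`).

```python
from math import prod

def crt(residues, moduli):
    """Solve X = residues[i] (mod moduli[i]) for pairwise coprime moduli."""
    N = prod(moduli)
    X = 0
    for c, n in zip(residues, moduli):
        M = N // n
        X += c * M * pow(M, -1, n)
    return X % N

def integer_cube_root(X):
    """Largest integer r with r^3 <= X, found by binary search."""
    lo, hi = 0, 1
    while hi ** 3 <= X:
        hi *= 2
    while hi - lo > 1:            # invariant: lo^3 <= X < hi^3
        mid = (lo + hi) // 2
        if mid ** 3 <= X:
            lo = mid
        else:
            hi = mid
    return lo

def broadcast_attack(ciphertexts, moduli):
    """Recover m from c_i = m^3 (mod N_i), assuming m^3 < N_1*N_2*N_3."""
    X = crt(ciphertexts, moduli)
    m = integer_cube_root(X)
    if m ** 3 != X:
        raise ValueError("X is not a perfect cube; the assumptions do not hold")
    return m

print(broadcast_attack([50, 268, 1], [323, 299, 341]))   # 67
```

An exact integer cube root is used rather than floating point, since for moduli of realistic size a float would lose the low-order digits of X.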
This attack and the previous one are interesting since we find the message without factoring
the modulus. This is evidence, albeit slight, that breaking RSA may be easier than factoring. The
main  lesson,  however,  from  both  these  attacks  is  that  plaintext  should  be  randomly  padded  before 
transmission.  That  way  the  same  “message”  is  never  encrypted  to  two  different  people.  In  addition 
one  should  probably  avoid  very  small  exponents  for  encryption;  e  =  65  537  is  the  usual  choice  now 
in  use.  However,  small  public  exponents  for  RSA  signatures  produce  no  such  problems.  So  for  RSA 
signatures  it  is  common  to  see  e  =  3  being  used  in  practice. 


15.4.  Some  Lattice-Based  Attacks  on  RSA 

In  this  section  we  examine  how  lattices  can  be  used  to  attack  certain  systems,  when  some  other 
side  information  is  known. 


15.4.1. Håstad's Attack: Earlier in this chapter we saw the following attack on the RSA system:
Given three public keys $(N_i, e_i)$ all with the same encryption exponent $e_i = 3$, if a user sent the same
message to all three public keys then an adversary could recover the plaintext using the Chinese
Remainder Theorem. Suppose that we try to protect against this attack by insisting that before
encrypting a message m we first pad with some user-specific data. For example the ciphertext
becomes, for user i,

\[ c_i = (i \cdot 2^h + m)^3 \pmod{N_i}. \]

However, one can still break this system using an attack due to Håstad. Håstad's attack is related
to Coppersmith's Theorem since we can interpret the attack scenario, generalized to k users and
public encryption exponent e, as being given k polynomials of degree e

\[ g_i(x) = (i \cdot 2^h + x)^e - c_i, \quad 1 \le i \le k. \]

Then given that there is an m such that

\[ g_i(m) = 0 \pmod{N_i}, \]

the goal is to recover m. We can assume that m is smaller than any one of the moduli $N_i$. Setting

\[ N = N_1 \cdot N_2 \cdots N_k \]

and using the Chinese Remainder Theorem we can find $t_i$ so that

\[ g(x) = \sum_{i=1}^{k} t_i \cdot g_i(x) \]

and

\[ g(m) = 0 \pmod{N}. \]

Then, since g has degree e and is monic, using Theorem 5.10 we can recover m in polynomial time,
as long as we have at least as many ciphertexts/users as the encryption exponent, i.e. $k \ge e$, since

\[ m < \min_i N_i \le N^{1/k} \le N^{1/e}. \]


15.4.2. Franklin-Reiter Attack and Coppersmith's Generalization: Now suppose we have
one RSA public key (N, e) owned by Alice. The Franklin-Reiter attack applies to the following
situation: Bob wishes to send two related messages $m_1$ and $m_2$ to Alice, where the relation is given
by the public polynomial

\[ m_1 = f(m_2) \pmod{N}. \]

We shall see that, given $c_1$ and $c_2$, an attacker has a good chance of determining $m_1$ and $m_2$ for
any small encryption exponent e. The attack is particularly simple when

\[ f = a \cdot x + b \quad \text{and} \quad e = 3, \]

with a and b fixed and given to the attacker. The attacker knows that $m_2$ is a root, modulo N, of
the two polynomials

\[ g_1(x) = x^3 - c_2, \]
\[ g_2(x) = f(x)^3 - c_1. \]

So the linear factor $x - m_2$ divides both $g_1(x)$ and $g_2(x)$.

We now form the greatest common divisor of $g_1(x)$ and $g_2(x)$. Strictly speaking this is not
possible in general since $(\mathbb{Z}/N\mathbb{Z})[x]$ is not a Euclidean ring, but if the Euclidean algorithm breaks
down then we would find a factor of N and so be able to find Alice's private key in any case. One
can show that when $f = a \cdot x + b$ and e = 3, the resulting gcd, when it exists, must always be the
linear factor $x - m_2$, and so the attacker can always find $m_2$ and then $m_1$.
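
To make the gcd step concrete, here is a small Python sketch for the case $f = a \cdot x + b$ and e = 3. It is only an illustration: polynomials are stored as coefficient lists (lowest degree first), the names `poly_rem`, `poly_gcd` and `franklin_reiter` are ours, and the toy modulus $1019 \cdot 1031$ and the values of a, b and $m_2$ are assumptions chosen for the demonstration. If a leading coefficient is not invertible modulo N, `pow` raises `ValueError`, which corresponds to the case in the text where the Euclidean algorithm breaks down.

```python
def poly_rem(a, b, n):
    """Remainder of a divided by b over Z/nZ (b has an invertible leading coefficient)."""
    a = [c % n for c in a]
    inv_lead = pow(b[-1], -1, n)       # fails exactly when a factor of n is exposed
    while len(a) >= len(b):
        factor = a[-1] * inv_lead % n
        shift = len(a) - len(b)
        for i, coeff in enumerate(b):
            a[shift + i] = (a[shift + i] - factor * coeff) % n
        while a and a[-1] == 0:        # drop leading zero coefficients
            a.pop()
    return a

def poly_gcd(a, b, n):
    """Monic gcd of a and b via the Euclidean algorithm over Z/nZ."""
    while b:
        a, b = b, poly_rem(a, b, n)
    inv_lead = pow(a[-1], -1, n)
    return [c * inv_lead % n for c in a]

def franklin_reiter(N, a, b, c1, c2):
    """Recover m2 from c2 = m2^3 and c1 = (a*m2 + b)^3 (mod N)."""
    g1 = [-c2 % N, 0, 0, 1]                                           # x^3 - c2
    g2 = [(b**3 - c1) % N, 3*a*b*b % N, 3*a*a*b % N, a**3 % N]        # (a*x + b)^3 - c1
    g = poly_gcd(g1, g2, N)
    if len(g) != 2:
        raise ValueError("gcd is not linear")
    return -g[0] % N                                                  # g = x - m2

# Toy demonstration (all values below are illustrative assumptions).
N = 1019 * 1031
a, b, m2 = 2, 5, 424242
m1 = (a * m2 + b) % N
c1, c2 = pow(m1, 3, N), pow(m2, 3, N)
print(franklin_reiter(N, a, b, c1, c2) == m2)   # True
```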

Coppersmith extended the attack of Franklin and Reiter in a way which also extends the
padding result from Håstad's attack. Suppose before sending a message m we pad it with some
random data. So for example if N is an n-bit RSA modulus and m is a k-bit message then we
could append n - k random bits to either the top or bottom of the message. Say

\[ m' = 2^{n-k} \cdot m + r \]

where r is some, per message, random number of length n - k. This would seem to be a good idea in
any case, since it makes the RSA function randomized, which might help in making it semantically
secure. However, Coppersmith showed that this naive padding method is insecure.

Suppose Bob sends the same message to Alice twice, i.e. we have ciphertexts $c_1$ and $c_2$
corresponding to the messages

\[ m_1 = 2^{n-k} \cdot m + r_1, \]
\[ m_2 = 2^{n-k} \cdot m + r_2, \]

where $r_1, r_2$ are two different random (n - k)-bit numbers. The attacker sets $y_0 = r_2 - r_1$ and is
led to solve the simultaneous equations

\[ g_1(x, y) = x^e - c_1, \]
\[ g_2(x, y) = (x + y)^e - c_2. \]

The attacker forms the resultant h(y) of $g_1(x, y)$ and $g_2(x, y)$ with respect to x. Now $y_0 = r_2 - r_1$
is a small root of the polynomial h(y), which has degree $e^2$. Using Coppersmith's Theorem 5.10
the attacker recovers $r_2 - r_1$ and then recovers $m_2$ using the method of the Franklin-Reiter attack.

Whilst the above trivial padding scheme is therefore insecure, one can find secure padding
schemes for the RSA encryption algorithm. We shall return to padding schemes for RSA in
Chapter 16.

15.4.3. Wiener's Attack on RSA: We have mentioned that often one uses a small public RSA
exponent e so as to speed up the public key operations in RSA. Sometimes we have applications
where it is more important to have a fast private key operation. Hence, one could be tempted
to choose a small value of the private exponent d. Clearly this will lead to a large value of the
encryption exponent e and we cannot choose too small a value for d, otherwise an attacker could
find d using exhaustive search. However, it turns out that d needs to be at least the size of $\frac{1}{3} \cdot N^{1/4}$
due to an ingenious attack by Wiener which uses continued fractions.

Wiener's attack uses continued fractions as follows. We assume we have an RSA modulus
$N = p \cdot q$ with $q < p < 2q$. In addition assume that the attacker knows that we have a small
decryption exponent $d < \frac{1}{3} \cdot N^{1/4}$. The encryption exponent e is given to the attacker, where this
exponent satisfies

\[ e \cdot d = 1 \pmod{\Phi}, \]

with $\Phi = \phi(N) = (p - 1) \cdot (q - 1)$.
We also assume $e < \Phi$, since this holds in most systems. First notice that this means there is an
integer k such that

\[ e \cdot d - k \cdot \Phi = 1. \]

Hence, we have

\[ \left| \frac{e}{\Phi} - \frac{k}{d} \right| = \frac{1}{d \cdot \Phi}. \]

Now, $\Phi \approx N$, since

\[ |N - \Phi| = |p + q - 1| < 3 \cdot \sqrt{N}. \]

So we should have that $e/N$ is a close approximation to $k/d$:

\[ \left| \frac{e}{N} - \frac{k}{d} \right| = \left| \frac{e \cdot d - N \cdot k}{d \cdot N} \right|
 = \left| \frac{e \cdot d - k \cdot \Phi - N \cdot k + k \cdot \Phi}{d \cdot N} \right|
 = \left| \frac{1 - k \cdot (N - \Phi)}{d \cdot N} \right|
 \le \left| \frac{3 \cdot k \cdot \sqrt{N}}{d \cdot N} \right|
 = \frac{3 \cdot k}{d \cdot \sqrt{N}}. \]

Since $e < \Phi$, it is clear that we have $k < d$, which is itself less than $\frac{1}{3} \cdot N^{1/4}$ by assumption. Hence,

\[ \left| \frac{e}{N} - \frac{k}{d} \right| < \frac{1}{2 \cdot d^2}. \]

Since gcd(k, d) = 1 we see that $k/d$ will be a fraction in its lowest terms. Hence, the fraction

\[ \frac{k}{d} \]

must arise as one of the convergents of the continued fraction expansion of

\[ \frac{e}{N}. \]

The correct one can be detected by simply testing which one gives a d which satisfies

\[ (m^e)^d = m \pmod{N} \]

for some random value of m. The total number of convergents we will need to take is of order
$O(\log N)$, hence the above gives a linear-time algorithm to determine the private exponent when it
is less than $\frac{1}{3} \cdot N^{1/4}$.

As an example suppose we have the RSA modulus

\[ N = 9\,449\,868\,410\,449 \]

with the public key

\[ e = 6\,792\,605\,526\,025. \]

We are told that the decryption exponent satisfies $d < \frac{1}{3} \cdot N^{1/4} \approx 584$. To apply Wiener's attack
we compute the continued fraction expansion of the number

\[ \alpha = \frac{e}{N} \]

and check each denominator of a convergent to see whether it is equal to the private key d. The
convergents of the continued fraction expansion of $\alpha$ are given by

\[ 1, \ \frac{2}{3}, \ \frac{3}{4}, \ \frac{5}{7}, \ \frac{18}{25}, \ \frac{23}{32}, \ \frac{409}{569}, \ \frac{1659}{2308}, \ \ldots \]

Checking each denominator in turn we see that the decryption exponent is given by

\[ d = 569, \]

which is the denominator of the seventh convergent.
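
A compact Python sketch of Wiener's attack on the example above follows. The function names `convergents` and `wiener_attack` are ours, and the fixed test message m = 2 is an assumption made so that the run is reproducible; the text suggests a random m. The generator also yields the trivial convergent 0/1 before the ones listed above, which the attack skips.

```python
def convergents(num, den):
    """Yield the convergents (k, d) of the continued fraction expansion of num/den."""
    k_prev, k_cur = 0, 1      # numerators
    d_prev, d_cur = 1, 0      # denominators
    while den:
        a = num // den                       # next partial quotient
        num, den = den, num % den
        k_prev, k_cur = k_cur, a * k_cur + k_prev
        d_prev, d_cur = d_cur, a * d_cur + d_prev
        yield k_cur, d_cur

def wiener_attack(N, e, m=2):
    """Return a small private exponent d with (m^e)^d = m (mod N), or None."""
    c = pow(m, e, N)
    for k, d in convergents(e, N):
        if k > 0 and pow(c, d, N) == m:      # test the candidate denominator
            return d
    return None

N = 9449868410449
e = 6792605526025
print(wiener_attack(N, e))   # 569
```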


15.4.4. Extension to Wiener's Attack: Boneh and Durfee, using an analogue of the bivariate
case of Coppersmith's Theorem 5.10, extended Wiener's attack to the case where

\[ d \le N^{0.292}, \]

using a heuristic algorithm, i.e. the range of "bad" values of d was extended further. We do not
go into the details but show how Boneh and Durfee proceed to a problem known as the small
inverse problem. Suppose we have an RSA modulus N, with encryption exponent e and decryption
exponent d. By definition there is an integer k such that

\[ e \cdot d + \frac{k \cdot \Phi}{2} = 1, \]

where $\Phi = \phi(N)$. Expanding the definition of $\Phi$ we find

\[ e \cdot d + k \cdot \left( \frac{N + 1}{2} - \frac{p + q}{2} \right) = 1. \]

We set

\[ s = -\frac{p + q}{2}, \qquad A = \frac{N + 1}{2}. \]

Then finding d, where d is small, say $d < N^{\delta}$, is equivalent to finding the two small solutions k and
s to the following congruence

\[ f(k, s) = k \cdot (A + s) = 1 \pmod{e}. \]

To see that k and s are small relative to the modulus e for the above equation, notice that $e \approx N$
since d is small, and so

\[ |s| < 2 \cdot N^{0.5} \approx e^{0.5} \quad \text{and} \quad |k| < \frac{2 \cdot d \cdot e}{\Phi} \le \frac{3 \cdot d \cdot e}{N} \approx e^{\delta}. \]

We can interpret this problem as finding an integer which is close to A whose inverse is small
modulo e. This is called the small inverse problem. Boneh and Durfee show that this problem
has a solution when $\delta \le 0.292$, hence extending Wiener's attack. This is done by applying the
multivariate analogue of Coppersmith's method to the polynomial f(k, s).

15.5. Partial Key Exposure Attacks on RSA

Partial  key  exposure  is  related  to  the  following  question:  Suppose  in  some  cryptographic  scheme 
the  attacker  recovers  a  certain  set  of  bits  of  the  private  key,  can  the  attacker  use  this  to  recover  the 
whole  private  key?  In  other  words,  does  partial  exposure  of  the  key  result  in  a  total  break  of  the 
system?  We  shall  present  a  number  of  RSA  examples,  however  these  are  not  the  only  ones.  There 
are  a  number  of  results  related  to  partial  key  exposure  which  relate  to  other  schemes  such  as  DSA 
or  symmetric-key-based  systems. 

15.5.1. Partial Exposure of the MSBs of the RSA Decryption Exponent: Somewhat
surprisingly for RSA, in the more common case of using a small public exponent e, one can trivially
recover half of the bits of the private key d, namely the most significant ones, as follows. Recall
that there is a value of k such that $0 < k < e$ with

\[ e \cdot d - k \cdot (N - (p + q) + 1) = 1. \]

Now suppose for each possible value of i, $0 < i < e$, the attacker computes

\[ d_i = \lfloor (i \cdot N + 1)/e \rfloor. \]

Then we have

\[ |d_k - d| < k \cdot (p + q)/e < 3 \cdot k \cdot \sqrt{N}/e < 3 \cdot \sqrt{N}. \]

Hence, $d_k$ is a good approximation for the actual value of d.

Now when e = 3 it is clear that with high probability we have k = 2 and so $d_2$ reveals half of
the most significant bits of d. Unluckily for the attacker, and luckily for the user, there is no known
way to recover the rest of d given only the most significant bits.
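
The approximation $d_k$ is easy to check numerically. The sketch below uses an assumed toy key ($p = 1019$, $q = 1031$, $e = 3$), which is not taken from the text; the private values d and k are computed only to compare against the attacker's candidates, which depend on the public key alone.

```python
from math import isqrt

def msb_candidates(N, e):
    """The attacker's candidates d_i = floor((i*N + 1)/e) for 0 < i < e."""
    return {i: (i * N + 1) // e for i in range(1, e)}

# Toy key, chosen for illustration only.
p, q, e = 1019, 1031, 3
N, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)            # the real private exponent
k = (e * d - 1) // phi         # the k with e*d - k*phi = 1

d_k = msb_candidates(N, e)[k]  # computed from the public key only
print(k, d, d_k, abs(d_k - d)) # 2 699027 700393 1366
```

Here the error 1366 is below $3 \cdot \sqrt{N} \approx 3072$, so $d_2$ and d agree in roughly their upper half of bits, as the bound predicts.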

15.5.2. Partial Exposure of Some Bits of the RSA Prime Factors: Suppose our n-bit RSA
modulus N is given by $p \cdot q$, with $p \approx q$, and that the attacker has found the n/4 least significant
bits of p. Recall that p is only around n/2 bits long in any case, so this means the attacker is given
the lower half of all the bits making up p. We write

\[ p = x_0 \cdot 2^{n/4} + p_0. \]

We then have, writing $q = y_0 \cdot 2^{n/4} + q_0$,

\[ N = p_0 \cdot q_0 \pmod{2^{n/4}}. \]

Hence, we can determine the value of $q_0$. We now write down the polynomial

\[ p(x, y) = (p_0 + 2^{n/4} \cdot x) \cdot (q_0 + 2^{n/4} \cdot y) \]
\[ = p_0 \cdot q_0 + 2^{n/4} \cdot (p_0 \cdot y + q_0 \cdot x) + 2^{n/2} \cdot x \cdot y. \]

Now p(x, y) is a bivariate polynomial of degree two which has a known small solution modulo N,
namely $(x_0, y_0)$ where $0 < x_0, y_0 \le 2^{n/4} \approx N^{1/4}$. Hence, using the heuristic bivariate extension
of Coppersmith's Theorem 5.10, we can recover $x_0$ and $y_0$ in polynomial time and so factor the
modulus N. A similar attack applies when the attacker knows the n/4 most significant bits of p.

15.5.3. Partial Exposure of the LSBs of the RSA Decryption Exponent: We now suppose
we are given, for small public exponent e, a quarter of the least significant bits of the private
exponent d. That is we have $d_0$ where

\[ d = d_0 + 2^{n/4} \cdot x_0 \]

where $0 \le x_0 \le 2^{3 \cdot n/4}$. Recall that there is a value of k with $0 < k < e$ such that

\[ e \cdot d - k \cdot (N - (p + q) + 1) = 1. \]

We then have, since $N = p \cdot q$,

\[ e \cdot d \cdot p - k \cdot p \cdot (N - p + 1) + k \cdot N = p. \]

If we set $p_0 = p \pmod{2^{n/4}}$ then we have the equation

\[ (22) \qquad e \cdot d_0 \cdot p_0 - k \cdot p_0 \cdot (N - p_0 + 1) + k \cdot N - p_0 = 0 \pmod{2^{n/4}}. \]

This gives us the following algorithm to recover the whole of d. For each value of k less than e
we solve equation (22) modulo $2^{n/4}$ for $p_0$. Each value of k will result in $O(n/4)$ possible values for
$p_0$. Using each of these values of $p_0$ in turn, we apply the previous technique for factoring N from
Section 15.5.2. One such value of $p_0$ will be the correct value of $p \pmod{2^{n/4}}$ and so the above
factorization algorithm will work and we can recover the value of d.


15.6.  Fault  Analysis 


An  interesting  class  of  attacks  results  from  trying  to  induce  faults  within  a  cryptographic  system. 
We  shall  describe  this  area  in  relation  to  the  RSA  signature  algorithm  but  similar  attacks  can  be 
mounted  on  other  cryptographic  algorithms,  both  public  and  symmetric  key.  Imagine  we  have  a 
hardware  implementation  of  RSA,  in  a  smart  card  say.  On  input  of  some  message  m  the  chip  will 
sign  the  message  for  us,  using  some  internal  RSA  private  key.  The  attacker  wishes  to  determine 
the  private  key  hidden  within  the  smart  card.  To  do  this  the  attacker  can  try  to  make  the  card 
perform  some  of  the  calculation  incorrectly,  by  either  altering  the  card’s  environment  by  heating 
or  cooling  it  or  by  damaging  the  circuitry  of  the  card  in  some  way. 

An interesting case is when the card uses the Chinese Remainder Theorem to perform the
signing operation, to increase efficiency, as explained in Chapter 6. The card first computes the
hash of the message

\[ h = H(m). \]

Then the card computes

\[ s_p = h^{d_p} \pmod{p}, \]
\[ s_q = h^{d_q} \pmod{q}, \]

where $d_p = d \pmod{p - 1}$ and $d_q = d \pmod{q - 1}$. The final signature is produced by the card
from $s_p$ and $s_q$ via the Chinese Remainder Theorem using

• $u = (s_q - s_p) \cdot T \pmod{q}$,

• $s = s_p + u \cdot p$,

where $T = p^{-1} \pmod{q}$.

Now suppose that the attacker can introduce a fault into the computation so that $s_p$ is computed
incorrectly. The attacker will then obtain a value of s such that

\[ s^e \ne h \pmod{p}, \]
\[ s^e = h \pmod{q}. \]

Hence, by computing

\[ q = \gcd(s^e - h, N) \]

she can factor the modulus.
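
The effect of a single faulty half-exponentiation can be simulated in software. The following Python sketch uses an assumed toy key ($p = 1019$, $q = 1031$, $e = 3$) and models the fault by adding one to $s_p$; both the key and the fault model are illustrative assumptions, and the integer `h` simply stands in for the hash value $H(m)$.

```python
from math import gcd

def crt_sign(h, p, q, d, fault=False):
    """RSA-CRT signature on the hash value h, optionally with a fault in s_p."""
    dp, dq = d % (p - 1), d % (q - 1)
    sp = pow(h, dp, p)
    sq = pow(h, dq, q)
    if fault:
        sp = (sp + 1) % p            # simulated fault: s_p is computed incorrectly
    T = pow(p, -1, q)                # T = p^{-1} (mod q)
    u = (sq - sp) * T % q
    return sp + u * p

# Toy key, chosen for illustration only.
p, q, e = 1019, 1031, 3
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))
h = 123456                           # stands in for H(m)

s_good = crt_sign(h, p, q, d)
s_bad = crt_sign(h, p, q, d, fault=True)
print(pow(s_good, e, N) == h)              # True: the correct signature verifies
print(gcd(pow(s_bad, e, N) - h, N))        # 1031: the faulty one reveals q
```

A common countermeasure, not discussed in the text above, is for the card to verify the signature with the public exponent before releasing it.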


Chapter  Summary 


•  RSA  is  the  most  popular  public  key  encryption  algorithm,  but  its  security  rests  on  the 
difficulty  of  the  RSA  problem  and  not  quite  on  the  difficulty  of  FACTOR. 

• Rabin encryption is based on the difficulty of extracting square roots modulo a composite
modulus. Since the problems SQRROOT and FACTOR are polynomial-time equivalent
this means that Rabin encryption is based on the difficulty of FACTOR.

•  Naive  RSA  is  not  IND-CPA  secure. 

•  The  RSA  encryption  algorithm  can  be  used  in  reverse  to  produce  a  public  key  signature 
scheme,  but  one  needs  to  combine  the  RSA  algorithm  with  a  hash  algorithm  to  obtain 
security  for  both  short  and  long  messages. 

•  Using  a  small  private  decryption  exponent  in  RSA  is  not  a  good  idea  due  to  Wiener’s 
attack. 

• The Håstad and Franklin-Reiter attacks imply that one needs to be careful when designing
padding schemes for RSA.

•  Using  Coppersmith’s  Theorem  one  can  show  that  revealing  a  proportion  of  the  bits  of 
either  p,  q  or  d  in  RSA  can  lead  to  a  complete  break  of  the  system. 


Further  Reading 

Still the best quick introduction to the concept of public key cryptography can be found in the
original paper of Diffie and Hellman; see also the original paper on RSA encryption. Our treatment
of attacks on RSA in this chapter has closely followed the survey article by Boneh.

D. Boneh. Twenty years of attacks on the RSA cryptosystem. Notices of the American Mathematical
Society (AMS), 46, 203-213, 1999.

W. Diffie and M. Hellman. New directions in cryptography. IEEE Trans. on Info. Theory, 22,
644-654, 1976.

R.L. Rivest, A. Shamir and L.M. Adleman. A method for obtaining digital signatures and public-key
cryptosystems. Comm. ACM, 21, 120-126, 1978.


CHAPTER  16 


Public  Key  Encryption  and  Signature  Algorithms 


Chapter  Goals 

•  To  present  fully  secure  public  key  encryption  schemes  and  signature  schemes. 

• To show how the random oracle model can be used to prove certain encryption and
signature schemes are secure.

•  To  understand  how  the  RSA  algorithm  is  actually  used  in  practice,  and  sketch  a  proof  of 
RSA-OAEP. 

•  To  introduce  and  formalize  the  notion  of  hybrid  encryption,  via  KEMs  and  DEMs. 

•  To  present  two  efficient  KEMs,  namely  RSA-KEM  and  DHIES-KEM. 

•  To  explain  the  most  widely  used  signature  algorithms,  namely  variants  of  RSA  and  DSA. 

•  To  present  the  Cramer-Shoup  encryption  and  signature  schemes,  which  do  not  require  the 
use  of  the  random  oracle  model. 

16.1.  Passively  Secure  Public  Key  Encryption  Schemes 

In  this  section  we  present  three  basic  passively  secure  encryption  schemes,  namely  the  Goldwasser- 
Micali  encryption  scheme,  the  ElGamal  encryption  scheme,  and  the  Paillier  encryption  scheme. 

16.1.1. Goldwasser-Micali Encryption: We have seen that RSA is not semantically secure
even against a passive attack; thus it would be nice to give a system which is IND-CPA secure
and is based on some factoring-like assumption. Historically the first system to meet these goals
was one by Goldwasser and Micali. The scheme is not used in real-life applications, due to its
inefficiency; however, its simplicity means that it can help solidify ideas about how to construct
(and prove secure) the systems we do use in real life.

The security of the Goldwasser-Micali encryption scheme is based on the hardness of the
QUADRES problem, namely given a composite integer N and an integer a, it is hard to test
whether a is a quadratic residue or not without knowledge of the factors of N. Let us recap, from
Chapter 2, that the set of squares in $(\mathbb{Z}/N\mathbb{Z})^*$ is denoted by

\[ Q_N = \{ x^2 \pmod{N} : x \in (\mathbb{Z}/N\mathbb{Z})^* \}, \]

and $J_N$ denotes the set of elements with Jacobi symbol equal to plus one, i.e.

\[ J_N = \left\{ a \in (\mathbb{Z}/N\mathbb{Z})^* : \left( \frac{a}{N} \right) = 1 \right\}. \]

The set of pseudo-squares is the difference $J_N \setminus Q_N$. For an RSA-like modulus $N = p \cdot q$ the
number of elements in $J_N$ is equal to $(p - 1) \cdot (q - 1)/2$, whilst the number of elements in $Q_N$ is
$(p - 1) \cdot (q - 1)/4$. The QUADRES problem is that given an element x of $J_N$, it is hard to tell
whether $x \in Q_N$, whilst it is easy to tell whether $x \in J_N$ or not.

We  can  now  explain  the  Goldwasser-Micali  encryption  system. 


©  Springer  International  Publishing  Switzerland  2016 

N.P. Smart, Cryptography Made Simple, Information Security and Cryptography, DOI 10.1007/978-3-319-21936-3_16



Key Generation: As a private key we take two large prime numbers $sk = (p, q)$ and then compute
the public modulus $N \leftarrow p \cdot q$, and an integer $y \in J_N \setminus Q_N$. The public key is set to be $pk \leftarrow (N, y)$.
The value of y is computed by the public key owner by first computing elements $y_p \in \mathbb{F}_p^*$ and
$y_q \in \mathbb{F}_q^*$ such that

\[ \left( \frac{y_p}{p} \right) = \left( \frac{y_q}{q} \right) = -1. \]

Then the value of y is computed from $y_p$ and $y_q$ via the Chinese Remainder Theorem. A value of
y computed in this way clearly does not lie in $Q_N$, but it does lie in $J_N$ since

\[ \left( \frac{y}{N} \right) = \left( \frac{y}{p} \right) \cdot \left( \frac{y}{q} \right) = (-1) \cdot (-1) = 1. \]

Encryption: The Goldwasser-Micali encryption system encrypts one bit of information at a time.
To encrypt the bit b,

• $x \leftarrow (\mathbb{Z}/N\mathbb{Z})^*$.

• $c \leftarrow y^b \cdot x^2 \pmod{N}$.

The ciphertext is then the value of c. Notice that this is very inefficient since a single bit of plaintext
requires $\log_2 N$ bits of ciphertext to transmit it.

Decryption: Notice that the ciphertext c will always be an element of $J_N$. However, if the message
bit b is zero then the value of c will be a quadratic residue, otherwise it will be a quadratic
non-residue. So all the decryptor has to do to recover the message is determine whether c is a quadratic
residue or not modulo N. But the decryptor is assumed to know the factors of N and so can
compute the Legendre symbol

\[ \left( \frac{c}{p} \right). \]

If this Legendre symbol is equal to plus one then c is a quadratic residue and so the message bit is
zero. If however the Legendre symbol is equal to minus one then c is not a quadratic residue and
so the message bit is one.
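
The three algorithms above can be sketched in Python as follows. This is a toy illustration only: the function names are ours, the primes 1019 and 1031 are assumed values far too small for real use, the Legendre symbol is computed with Euler's criterion, and Python's `random` module is not a cryptographic source of randomness.

```python
import random
from math import gcd

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gm_keygen(p, q):
    """Return (pk, sk) = ((N, y), (p, q)) with y a pseudo-square modulo N."""
    yp = next(a for a in range(2, p) if legendre(a, p) == -1)
    yq = next(a for a in range(2, q) if legendre(a, q) == -1)
    N = p * q
    # Chinese Remainder Theorem: y = yp (mod p) and y = yq (mod q).
    y = (yp * q * pow(q, -1, p) + yq * p * pow(p, -1, q)) % N
    return (N, y), (p, q)

def gm_encrypt(pk, b):
    """Encrypt a single bit b as c = y^b * x^2 (mod N) for random x."""
    N, y = pk
    while True:
        x = random.randrange(1, N)
        if gcd(x, N) == 1:
            break
    return pow(y, b, N) * x * x % N

def gm_decrypt(sk, c):
    """Bit 0 if c is a square modulo p, bit 1 otherwise."""
    p, _ = sk
    return 0 if legendre(c, p) == 1 else 1

pk, sk = gm_keygen(1019, 1031)
bits = [1, 0, 1, 1, 0]
print([gm_decrypt(sk, gm_encrypt(pk, b)) for b in bits])   # [1, 0, 1, 1, 0]
```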


It is now relatively straightforward to prove that the Goldwasser-Micali encryption scheme is
IND-CPA secure, assuming that the QUADRES problem is hard for RSA style moduli N of size v
bits.

Theorem 16.1. Suppose there is an adversary A against the IND-CPA security of the
Goldwasser-Micali encryption scheme $\Pi$ for moduli of size v bits, then there is an adversary B against the
QUADRES problem such that

\[ \mathrm{Adv}_{\Pi}^{\text{IND-CPA}}(A) = 2 \cdot \mathrm{Adv}_{v}^{\text{QUADRES}}(B). \]

Proof. We describe the algorithm B which will use A as an oracle. To see this in pictures see
Figure 16.1. Suppose algorithm B is given N and $j \in J_N$ and is asked to determine whether
$j \in Q_N$. Algorithm B first randomizes j to form y, on the assumption that j does not lie in $Q_N$.
Thus algorithm B sets $y \leftarrow j \cdot z^2 \pmod{N}$, for some $z \leftarrow (\mathbb{Z}/N\mathbb{Z})^*$. The public key $pk \leftarrow (N, y)$ is
then passed to algorithm A.

Since the Goldwasser-Micali system only encrypts bits we can assume that the find stage of
the adversary A will simply output the two messages

\[ m_0 = 0 \quad \text{and} \quad m_1 = 1. \]

We now form the challenge ciphertext

\[ c^* \leftarrow y^b \cdot r^2, \]

for some bit $b \leftarrow \{0, 1\}$ and some random $r \leftarrow (\mathbb{Z}/N\mathbb{Z})^*$ chosen by algorithm B. We now pass $c^*$ to
algorithm A, which will respond with its guess b' for the bit b. If b = b' then algorithm B returns
that j is not a quadratic residue, otherwise it returns that it is.

To analyse the probabilities we notice that if j is not a quadratic residue then y is a valid public
key and this value of $c^*$ will be a valid encryption of the message $m_b$. So if j is not a quadratic
residue then algorithm B is presenting a valid challenger to algorithm A. However, if j is a quadratic
residue then this is not a valid encryption of anything (since the public key is not even valid);
in that case $c^*$ is a quadratic residue whatever the value of b, so A can do no better than guess.
Thus we have

\[ \mathrm{Adv}_{v}^{\text{QUADRES}}(B) = \Big| \Pr[b' = b \mid y \in Q_N] - \Pr[b' = b \mid y \in J_N \setminus Q_N] \Big| \]
\[ = \left| \Pr[A \text{ wins for a valid challenger}] - \frac{1}{2} \right| \]
\[ = \frac{1}{2} \cdot \mathrm{Adv}_{\Pi}^{\text{IND-CPA}}(A). \]

Figure 16.1. How B interacts with A in Theorem 16.1

□

Note  that  the  above  argument  says  nothing  about  whether  the  Goldwasser-Micali  encryption 
scheme  is  secure  against  adaptive  adversaries.  In  fact,  one  can  show  it  is  not  secure  against  such 
adversaries. 

Theorem  16.2.  The  Goldwasser-Micali  encryption  scheme  is  not  IND-CCA  secure. 

Proof. Suppose $c^*$ is the target ciphertext and we want to determine what bit b is encrypted
by $c^*$. Recall that $c^* = y^b \cdot x^2 \pmod{N}$. Now the rules of the game do not allow us to ask our
decryption oracle to decrypt $c^*$, but we can ask our oracle to decrypt any other ciphertext. We
therefore produce the ciphertext

\[ c = c^* \cdot z^2 \pmod{N}, \]

for some random value $z \in (\mathbb{Z}/N\mathbb{Z})^*$. It is easy to see that c is an encryption of the same bit b.
Hence, by asking our oracle to decrypt c we will obtain the decryption of $c^*$. □
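
The re-randomization in this proof can be demonstrated with a short, self-contained Python sketch. Everything here is an illustrative assumption: the toy primes, the helper names, and the way the decryption oracle is modelled as a closure that refuses to decrypt the target ciphertext.

```python
import random
from math import gcd

# Toy Goldwasser-Micali instance (illustrative values only).
p, q = 1019, 1031
N = p * q
# y is a non-residue modulo both p and q, hence a pseudo-square modulo N.
y = next(a for a in range(2, N)
         if pow(a, (p - 1) // 2, p) == p - 1 and pow(a, (q - 1) // 2, q) == q - 1)

def random_unit():
    """A random element of (Z/NZ)^*."""
    while True:
        z = random.randrange(1, N)
        if gcd(z, N) == 1:
            return z

def encrypt(b):
    return pow(y, b, N) * random_unit() ** 2 % N

def make_oracle(c_star):
    """Decryption oracle for the CCA game: decrypts anything except c_star."""
    def oracle(c):
        if c == c_star:
            raise ValueError("the target ciphertext may not be queried")
        return 0 if pow(c, (p - 1) // 2, p) == 1 else 1
    return oracle

def cca_attack(c_star, oracle):
    """Recover the bit inside c_star by querying a re-randomized ciphertext."""
    while True:
        z = random_unit()
        c = c_star * z * z % N
        if c != c_star:           # z^2 = 1 (mod N) would give back c_star
            return oracle(c)

for b in (0, 1):
    c_star = encrypt(b)
    print(b, cca_attack(c_star, make_oracle(c_star)))   # the two values agree
```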

16.1.2. ElGamal Encryption: The Goldwasser-Micali encryption scheme is passively secure,
but not efficient. What we really want is a simple encryption algorithm which is efficient and which
is passively secure¹. The simplest efficient IND-CPA secure encryption algorithm is the ElGamal
encryption algorithm, which is based on the discrete logarithm problem. In the following we shall
describe the finite field analogue of ElGamal encryption; we leave it as an exercise to write down
the elliptic curve variant.

¹At this point we still focus on passively secure systems. Once we have solved this, we will turn our focus to
actively secure schemes.

Domain  Parameters:  Unlike  the  RSA  algorithm,  in  ElGamal  encryption  there  are  some  public 
parameters  which  can  be  shared  by  a  number  of  users.  These  are  called  the  domain  parameters 
and  are  given  by 

•  pa  “large  prime” ,  by  which  we  mean  one  with  around  2048  bits,  such  that  p  —  1  is  divisible 
by  another  “medium  prime”  q  of  around  256  bits. 

•  g  an  element  of  F*  of  prime  order  q,  i.e.  g  =  (mod  p)  /  1  for  some  r  G  F*. 

The  domain  parameters  create  a  public  finite  abelian  group  G  of  prime  order  q  with  generator  g. 

Key  Generation:  Once  these  domain  parameters  have  been  fixed,  the  public  and  private  keys 
can then be determined. The private key $\mathfrak{sk}$ is chosen to be an integer $x \leftarrow [0, \ldots, q-1]$, whilst
the public key is given by $\mathfrak{pk} := h \leftarrow g^x \pmod{p}$. Notice that, whilst each user in RSA needed to
generate  two  large  primes  to  set  up  their  key  pair  (which  is  a  costly  task),  for  ElGamal  encryption 
each  user  only  needs  to  generate  a  random  number  and  perform  a  modular  exponentiation  to 
generate  a  key  pair. 


Encryption: Messages are assumed to be elements of the group G. To encrypt a message $m \in G$
we do the following:

• $k \leftarrow \{0, \ldots, q-1\}$,

• $c_1 \leftarrow g^k$,

• $c_2 \leftarrow m \cdot h^k$,

• Output the ciphertext, $c \leftarrow (c_1, c_2) \in G \times G$.

Notice  that  since  each  message  has  a  different  ephemeral  key  k,  encrypting  the  same  message  twice 
will  produce  different  ciphertexts. 


Decryption: To decrypt a ciphertext $c = (c_1, c_2)$ we compute

$$\frac{c_2}{c_1^x} = \frac{m \cdot h^k}{g^{x \cdot k}} = \frac{m \cdot g^{x \cdot k}}{g^{x \cdot k}} = m.$$


ElGamal Example: We first need to set up the domain parameters. For our small example we
choose $q = 101$, $p = 809$ and $g = 256$. Note that q divides $p - 1$ and that g has order q in the
multiplicative group of integers modulo p. As a public/private key pair we choose

• $x \leftarrow 68$,

• $h \leftarrow g^x = 498$.

Now suppose we wish to encrypt the message $m = 100$ to the user with the above ElGamal public
key.

• We generate a random ephemeral key $k \leftarrow 89$.

• Set $c_1 \leftarrow g^k = 468$.

• Set $c_2 \leftarrow m \cdot h^k = 494$.

• Output the ciphertext as $c = (468, 494)$.

The recipient can decrypt our ciphertext by computing

$$\frac{c_2}{c_1^x} = \frac{494}{468^{68}} = 100.$$

This last value is computed by first computing $468^{68}$, taking the inverse modulo p of the result and
then multiplying this value by 494.
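
The example can be reproduced with a few lines of Python. This is a minimal sketch using the toy parameters above; the function names are illustrative assumptions, the ephemeral key is passed in explicitly so that the numbers match the text, and `pow(a, -1, p)` for modular inversion assumes Python 3.8 or later.

```python
P, Q, G = 809, 101, 256          # toy domain parameters from the example

def public_key(x):
    """Public key h = g^x (mod p) for the private key x."""
    return pow(G, x, P)

def encrypt(h, m, k):
    """ElGamal encryption of m under h with ephemeral key k."""
    c1 = pow(G, k, P)
    c2 = (m * pow(h, k, P)) % P
    return c1, c2

def decrypt(x, c):
    """Compute c2 / c1^x (mod p) via a modular inverse."""
    c1, c2 = c
    return (c2 * pow(pow(c1, x, P), -1, P)) % P

h = public_key(68)
c = encrypt(h, 100, 89)
m = decrypt(68, c)
```

In real use the ephemeral key k would be drawn fresh and uniformly at random for every message, and the parameters would be of the sizes given under Domain Parameters.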


We  can  now  start  to  establish  basic  security  results  about  ElGamal  encryption  by  presenting 
two  results  in  the  passive  security  setting.  Our  first  one  says  that  if  the  Diffie-Hellman  problem 
is hard then ElGamal is OW-CPA, whilst the second one says that if the Decision Diffie-Hellman
problem is hard then ElGamal is IND-CPA.

Theorem  16.3.  If  A  is  an  adversary  against  the  OW-CPA  security  of  the  ElGamal  encryption 
scheme $\Pi(G)$ over the group G, then there is an adversary B against the Diffie-Hellman problem
such that

$$\mathrm{Adv}^{\mathrm{OW\text{-}CPA}}_{\Pi(G)}(A) = \mathrm{Adv}^{\mathrm{DHP}}_{G}(B).$$

Proof. The algorithm A takes as input a public key h and a target ciphertext $c^* = (c_1, c_2)$, and
returns the underlying plaintext. We will show how to use this algorithm to create an algorithm
B to solve the DHP. We suppose B is given $X = g^x$ and $Y = g^y$, and is asked to solve the
Diffie-Hellman problem, i.e. to output the value of $g^{x \cdot y}$; algorithm B then proceeds as in Algorithm 16.1.


Algorithm 16.1: Algorithm to solve DHP given an algorithm to break the one-way security
of ElGamal

As input we have $X = g^x \in G$ and $Y = g^y \in G$.
$h \leftarrow X = g^x$.
$c_1 \leftarrow Y = g^y$.
$c_2 \leftarrow G$.
$c^* \leftarrow (c_1, c_2)$.
$m \leftarrow A(c^*, h)$.
return $c_2/m$.


In words, algorithm B first sets up an ElGamal public key which depends on the input to the
Diffie-Hellman problem, i.e. we set $h \leftarrow X = g^x$ (note that algorithm B does not know what the
corresponding private key is). Now we write down the target “ciphertext” $c^* = (c_1, c_2)$, where

• $c_1 \leftarrow Y = g^y$,

• $c_2 \leftarrow G$, i.e. a random element of the group.

This ciphertext is sent to algorithm A, along with the public key h. Algorithm A will then output
(if successful) the underlying plaintext m, i.e. the value satisfying $c_2 = m \cdot h^y$, since $c_1 = g^y$ plays
the role of $g^k$. We then solve the original Diffie-Hellman problem by computing

$$Z \leftarrow \frac{c_2}{m} = \frac{m \cdot h^y}{m} = h^y = g^{x \cdot y}.$$

□
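
The reduction can be exercised in code on the toy group with q = 101, p = 809 and g = 256. The sketch below is only illustrative: since no real OW-CPA adversary is available, `make_perfect_adversary` is a hypothetical stand-in which "breaks" ElGamal by secretly knowing x, so that we can check that B returns $g^{x \cdot y}$ whenever A succeeds. All names are assumptions.

```python
import secrets

P, Q, G = 809, 101, 256              # toy group of order Q inside F_P^*

def make_perfect_adversary(x):
    """Hypothetical OW-CPA adversary A; it cheats by knowing x (test only)."""
    def adversary(c_star, h):
        c1, c2 = c_star
        return (c2 * pow(pow(c1, x, P), -1, P)) % P
    return adversary

def solve_dhp(X, Y, adversary):
    """Algorithm B: uses A to compute g^(x*y) from X = g^x and Y = g^y."""
    h = X                                      # public key with unknown private key
    c1 = Y                                     # plays the role of g^k with k = y
    c2 = pow(G, secrets.randbelow(Q), P)       # random element of the group
    m = adversary((c1, c2), h)                 # A inverts the "ciphertext"
    return (c2 * pow(m, -1, P)) % P            # c2 / m = h^y = g^(x*y)
```

Note that B itself never touches x or y; only the stand-in adversary does.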


We can use a similar technique to prove that ElGamal is IND-CPA, but now we have to assume
that the Decision Diffie-Hellman problem is hard. Notice that to obtain a stronger notion of
security, we have to assume a weaker (i.e. easier) problem is hard, i.e. make a stronger assumption.

Theorem  16.4.  If  A  is  an  adversary  against  the  IND-CPA  security  of  the  ElGamal  encryption 
scheme $\Pi(G)$ over the group G, then there is an adversary B against the Decision Diffie-Hellman
problem such that

$$\mathrm{Adv}^{\mathrm{IND\text{-}CPA}}_{\Pi(G)}(A) = 2 \cdot \mathrm{Adv}^{\mathrm{DDH}}_{G}(B).$$

Proof. As usual we will use algorithm A as a subroutine called by algorithm B. Our algorithm B
for solving DDH now proceeds as in Algorithm 16.2.

Algorithm 16.2: Algorithm to solve DDH given an algorithm to break the semantic security
of ElGamal

As input we have $X = g^x$, $Y = g^y$ and $Z = g^z$.
$h \leftarrow X = g^x$.
$(m_0, m_1, s) \leftarrow A(\text{find}, h)$.
$c_1 \leftarrow Y = g^y$.
$b \leftarrow \{0, 1\}$.
$c_2 \leftarrow m_b \cdot g^z$.
$c^* \leftarrow (c_1, c_2)$.
$b' \leftarrow A(\text{guess}, c^*, s)$.
if $b = b'$ then return true,
else return false.

To see why this algorithm solves the DDH problem consider the following argument.

• In the case when $z = x \cdot y$ then the encryption input into the guess stage of algorithm
A will be a valid encryption of $m_b$. Hence, if algorithm A can really break the semantic
security of ElGamal encryption then the output $b'$ will be correct and the algorithm will
return true.

• Now suppose that $z \neq x \cdot y$, then the encryption input into the guess stage is almost
definitely invalid, i.e. not an encryption of $m_0$ or $m_1$. Hence, the output $b'$ of the guess
stage will be independent of the value of b. Therefore we expect the above algorithm to
return true or false with equal probability, and so $b' = b$ with probability 1/2.

This  is  exactly  the  same  argument  that  we  had  in  the  proof  of  security  of  the  Goldwasser-Micali 
encryption  scheme,  and  so  the  relationship  between  the  advantages  will  follow  in  the  same  way.  □ 
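
The two cases of the argument can be observed experimentally on the toy group with q = 101, p = 809 and g = 256. The following is a sketch under stated assumptions: `CheatingAdversary` is a hypothetical stand-in for a perfect IND-CPA adversary (it secretly knows x), and all names are illustrative.

```python
import secrets

P, Q, G = 809, 101, 256              # toy group of order Q inside F_P^*

class CheatingAdversary:
    """Hypothetical IND-CPA adversary A; it knows x, for testing only."""
    def __init__(self, x):
        self.x = x

    def find(self, h):
        m0, m1 = pow(G, 1, P), pow(G, 2, P)    # two distinct group elements
        return m0, m1, (m0, m1)                # state s remembers both messages

    def guess(self, c_star, s):
        c1, c2 = c_star
        m = (c2 * pow(pow(c1, self.x, P), -1, P)) % P
        m0, m1 = s
        if m == m0:
            return 0
        if m == m1:
            return 1
        return secrets.randbelow(2)            # invalid ciphertext: coin flip

def ddh_distinguisher(X, Y, Z, adversary):
    """Algorithm B: returns True when it believes Z = g^(x*y)."""
    h = X
    m0, m1, s = adversary.find(h)
    c1 = Y
    b = secrets.randbelow(2)
    c2 = ((m0, m1)[b] * Z) % P
    b_prime = adversary.guess((c1, c2), s)
    return b == b_prime
```

On a genuine Diffie-Hellman triple the distinguisher always returns True with this adversary; on a triple with random Z it returns True only about half of the time.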


Despite  the  above  positive  results  on  the  security  of  ElGamal  encryption  we  still  do  not  have  a 
scheme  which  is  secure  against  adaptive  chosen  ciphertext  attacks.  The  main  reason  for  this  is  that 
ElGamal  is  trivially  malleable.  Given  a  ciphertext  for  the  message  m, 

$$(c_1, c_2) = (g^k, m \cdot h^k),$$

one can then create a valid ciphertext for the message $2 \cdot m$ without ever knowing m, nor the
ephemeral key k, nor the private key x. In particular the following ciphertext decrypts to $2 \cdot m$,

$$(c_1, 2 \cdot c_2) = (g^k, 2 \cdot m \cdot h^k).$$

One  can  use  this  malleability  property,  just  as  we  did  with  RSA  in  Lemma  15.4,  to  show  that 
ElGamal encryption is not OW-CCA secure. Notice that $(1, 2)$ is a “trivial” encryption of the
number 2, and we are in some sense combining the ciphertext $(1, 2)$ with the ciphertext $(c_1, c_2)$
to produce a ciphertext which encrypts $2 \cdot m$. Any two ciphertexts can be combined in this way,
to  produce  a  ciphertext  which  encrypts  the  product  of  the  underlying  plaintexts.  An  encryption 
scheme  with  this  property  is  called  multiplicatively  homomorphic. 

Lemma  16.5.  ElGamal  is  not  OW-CCA. 


Proof. Suppose the ciphertext the adversary wants to invert is $c^* = (c_1, c_2) = (g^k, m^* \cdot h^k)$. The
adversary then creates the related ciphertext $c = (c_1, 2 \cdot c_2)$ and asks her decryption oracle to decrypt
c to give the message m. Then Eve computes

$$\frac{m}{2} = \frac{2 \cdot c_2 \cdot c_1^{-x}}{2} = \frac{2 \cdot m^* \cdot h^k \cdot g^{-x \cdot k}}{2} = \frac{2 \cdot m^* \cdot g^{x \cdot k} \cdot g^{-x \cdot k}}{2} = \frac{2 \cdot m^*}{2} = m^*.$$


□ 
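
The attack is easy to run against the toy ciphertext (468, 494) from the ElGamal example, which encrypts 100 under the private key x = 68. This is a sketch only; the oracle and function names are illustrative assumptions.

```python
P, G = 809, 256
X_PRIVATE = 68                       # known to the decryption oracle only

def decryption_oracle(c, c_star):
    """Decrypts anything except the target ciphertext c_star."""
    if c == c_star:
        raise ValueError("refusing to decrypt the target ciphertext")
    c1, c2 = c
    return (c2 * pow(pow(c1, X_PRIVATE, P), -1, P)) % P

def ow_cca_attack(c_star):
    """Eve's attack: query (c1, 2*c2) and halve the answer modulo p."""
    c1, c2 = c_star
    related = (c1, (2 * c2) % P)
    m = decryption_oracle(related, c_star)
    return (m * pow(2, -1, P)) % P

recovered = ow_cca_attack((468, 494))
```

Division by 2 is carried out as multiplication by the inverse of 2 modulo p, so the attack also works when $2 \cdot m^*$ wraps around modulo p.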


16.1.3.  Paillier  Encryption:  There  is  an  efficient  system,  due  to  Paillier,  based  on  the  difficulty 
of  factoring  large  integers,  which  can  be  shown  to  be  IND-CPA.  Paillier’s  scheme  has  a  number 
of  interesting  properties,  such  as  the  fact  that  it  is  additively  homomorphic  (which  means  it  has 
found  application  in  electronic  voting  applications). 
Key Generation: We first pick an RSA modulus $N = p \cdot q$, but instead of working with the
multiplicative group $(\mathbb{Z}/N\mathbb{Z})^*$ we work with $(\mathbb{Z}/N^2\mathbb{Z})^*$. The order of this last group is given by
$\phi(N^2) = N \cdot (p-1) \cdot (q-1) = N \cdot \phi(N)$. This means, by Lagrange’s Theorem, that for all a with
$\gcd(a, N) = 1$ we have

$$a^{N \cdot (p-1) \cdot (q-1)} \equiv 1 \pmod{N^2}.$$

The private key for Paillier’s scheme is defined to be an integer d such that

$$d \equiv 1 \pmod{N}, \qquad d \equiv 0 \pmod{(p-1) \cdot (q-1)};$$

such a value of d can be found by the Chinese Remainder Theorem. The public key $\mathfrak{pk}$ is just the
integer N, and the private key $\mathfrak{sk}$ is the integer d.

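Encryption: Messages are defined to be elements of $\mathbb{Z}/N\mathbb{Z}$. To encrypt such a message the
encryptor picks an integer $r \in \mathbb{Z}/N^2\mathbb{Z}$ and computes $c \leftarrow (1+N)^m \cdot r^N \pmod{N^2}$.

Decryption: To decrypt one first computes

$$\begin{aligned}
t &\leftarrow c^d \pmod{N^2} \\
  &= (1+N)^{m \cdot d} \cdot r^{d \cdot N} \pmod{N^2} \\
  &= (1+N)^{m \cdot d} \pmod{N^2} && \text{since } d \equiv 0 \pmod{(p-1) \cdot (q-1)} \\
  &= 1 + m \cdot d \cdot N \pmod{N^2} \\
  &= 1 + m \cdot N \pmod{N^2} && \text{since } d \equiv 1 \pmod{N}.
\end{aligned}$$

Then to recover the message we compute $R \leftarrow \frac{t-1}{N}$, which is equal to m.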

Just  like  the  other  schemes  presented  so  far,  Paillier  encryption  is  malleable,  and  hence  it  cannot 
be  OW-CCA  secure.  However,  unlike  RSA,  Rabin  and  ElGamal,  the  malleability  is  additive.  In 
particular given two ciphertexts, $c_1 = (1+N)^{m_1} \cdot r_1^N$ and $c_2 = (1+N)^{m_2} \cdot r_2^N$, encrypting
messages $m_1$ and $m_2$ we can easily form the encryption of the sum of the plaintexts by computing

$$c_1 \cdot c_2 = \left((1+N)^{m_1} \cdot r_1^N\right) \cdot \left((1+N)^{m_2} \cdot r_2^N\right) = (1+N)^{m_1+m_2} \cdot (r_1 \cdot r_2)^N.$$

Thus we say that Paillier encryption is additively homomorphic; this should be compared to the
multiplicatively homomorphic nature of RSA, ElGamal encryption and Rabin encryption. Finding
an IND-CPA secure encryption scheme which is simultaneously both additively and multiplicatively
homomorphic was a major open research question in cryptography for over thirty years. Such a
Fully Homomorphic Encryption (FHE) scheme was given in 2009 by Gentry, and we shall return
to  such  FHE  schemes  in  Chapter  17. 
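
Paillier key generation, encryption, decryption and the additive homomorphism can be sketched as follows. The primes 11 and 13 are toy values chosen purely for illustration, the function names are assumptions, and r is sampled coprime to N so that the decryption identity above applies.

```python
import math
import secrets

P, Q = 11, 13                        # toy primes (assumption; far too small in practice)
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)
# CRT solution of d = 1 (mod N) and d = 0 (mod (p-1)(q-1))
D = PHI * pow(PHI, -1, N)

def encrypt(m):
    """c = (1 + N)^m * r^N (mod N^2) for a random r coprime to N."""
    while True:
        r = secrets.randbelow(N2 - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(1 + N, m, N2) * pow(r, N, N2)) % N2

def decrypt(c):
    """t = c^d = 1 + m*N (mod N^2), so m = (t - 1) / N."""
    t = pow(c, D, N2)
    return (t - 1) // N

def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the plaintexts modulo N."""
    return (c1 * c2) % N2
```

For instance `decrypt(add_encrypted(encrypt(20), encrypt(22)))` gives back 42 without the party doing the multiplication ever seeing 20 or 22, which is the property exploited in the voting applications mentioned earlier.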

The Paillier encryption scheme can be proved to be IND-CPA secure assuming the following
generalization of the QUADRES problem is hard. Instead of detecting whether something is a
square modulo N, the adversary needs to detect whether something is an Nth power modulo $N^2$.
We present the problem diagrammatically in Figure 16.2, and leave it to the reader to show that
Paillier encryption is IND-CPA under this assumption.

$b \leftarrow \{0, 1\}$
$p, q \leftarrow \{v/2\text{-bit primes}\}$
$N \leftarrow p \cdot q$
$a \leftarrow (\mathbb{Z}/N^2\mathbb{Z})^*$
If $b = 1$ then $a \leftarrow a^N \pmod{N^2}$
$N, a \longrightarrow A$
$b' \longleftarrow A$
Win if $b = b'$

Figure 16.2. Security game to define the Decision Composite Residuosity Problem (DCRP)

16.2. Random Oracle Model, OAEP and the Fujisaki-Okamoto Transform

We have now seen various proofs of security for public key encryption schemes, yet none of them
prove security against adversaries which can make chosen ciphertext queries. Let us see why this
might be a problem for our existing proof techniques. To recap, we are given an algorithm A and
the proof proceeds by trying to create a new algorithm B which uses A as a subroutine. The input
to B is the hard mathematical problem we wish to solve (e.g. factoring), whilst the input to A is
some cryptographic problem. Since we have a public key algorithm, adversary A can make its own
encryption queries, as soon as it is given the public key. Thus when A is a CPA adversary there
are no oracle queries which have to be answered by B. The usual trick in these proofs is that the
hard mathematical problem is based on some secret which B does not know, and that this secret
becomes the secret key of the public key cryptosystem which A is trying to break.

The difficulty arises when A is a CCA adversary; in this case A is allowed to call a decryption
oracle for the input public key. The algorithm B, if it wants to use algorithm A as a subroutine,
needs to supply the answers to A’s oracle queries. In constructing algorithm B we now have a
number of problems:

• Its responses must appear valid (i.e. valid encryptions should decrypt), otherwise algorithm
A would notice its decryption oracle was lying. Hence, algorithm B could no longer
guarantee that algorithm A was successful with non-negligible probability.

• The responses of the decryption oracle should be consistent with the probability distributions
of responses that A expects if the oracle was a true decryption oracle. Again,
otherwise A would notice.

•  The  responses  of  the  decryption  oracle  should  be  consistent  across  all  the  calls  made  by 
the  adversary  A. 

•  Algorithm  B  needs  to  supply  these  answers  without  knowing  the  secret  key.  To  decrypt 
we  appear  to  need  to  know  the  secret  key,  which  is  exactly  what  B  does  not  have.  In  most 
cases  if  B  knew  the  secret  key  it  would  not  need  A  in  the  first  place! 

This  last  point  is  the  most  crucial  one.  We  are  essentially  asking  B  to  decrypt  a  ciphertext  without 
knowing  the  private  key,  but  this  is  meant  to  be  impossible  since  our  scheme  is  meant  to  be  secure. 

To get around this problem it has become common practice to use the “random oracle model”,
which  we  introduced  in  Chapter  11.  Recall  that  a  random  oracle  is  an  idealized  hash  function  which 
on  input  of  a  new  query  will  pick,  uniformly  at  random,  some  response  from  its  output  domain,  and 
which  if  asked  the  same  query  twice  will  always  return  the  same  response.  So  to  use  the  random 
oracle  model  we  need  to  include  a  hash  function  somewhere  in  the  processing  of  our  encryption 
and/or  decryption  operations. 

In  the  random  oracle  model  we  assume  our  adversary  A  makes  no  use  of  the  explicit  hash 
function  being  used  in  the  scheme  under  attack.  In  other  words  the  adversary  A  runs,  and  is 
successful,  even  if  we  replace  the  real  hash  function  by  a  random  oracle.  The  algorithm  B  responds 
to  the  decryption  oracle  queries  of  A  by  cheating  and  “cooking”  the  responses  of  the  random  oracle 
to  suit  his  own  needs. 
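
A random oracle is usually simulated by lazy sampling: answers are drawn only when a query is first seen, and remembered from then on. The class below is a minimal sketch of this idea; the class name and the `program` method (which models algorithm B "cooking" a response) are illustrative assumptions.

```python
import secrets

class RandomOracle:
    """Lazily sampled random function from byte strings to out_bytes bytes."""

    def __init__(self, out_bytes):
        self.out_bytes = out_bytes
        self.table = {}                      # query -> response seen so far

    def query(self, x: bytes) -> bytes:
        """Fresh queries get a uniform answer; repeated queries a consistent one."""
        if x not in self.table:
            self.table[x] = secrets.token_bytes(self.out_bytes)
        return self.table[x]

    def program(self, x: bytes, y: bytes) -> None:
        """Fix the response on x, as a simulator may do before x is queried."""
        if x in self.table and self.table[x] != y:
            raise ValueError("oracle already answered this query differently")
        self.table[x] = y
```

The table of past queries is exactly what the simulator in a random oracle proof gets to inspect, and programming is only undetectable when the programmed value is itself distributed uniformly.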

A  proof  in  the  random  oracle  model  is  an  even  more  relativized  proof  than  that  which  we 
considered  before.  Such  a  proof  says  that  assuming  some  problem  is  hard,  say  factoring,  then  an 
adversary  cannot  exist  which  makes  no  use  of  the  underlying  hash  function.  This  does  not  imply 
that  an  adversary  does  not  exist  which  uses  the  real  specific  hash  function  as  a  means  of  breaking 
the  cryptographic  system. 
16.2.1. RSA-OAEP: Recall that the raw RSA function does not provide a semantically secure
encryption  scheme,  even  against  passive  adversaries.  To  make  a  system  which  is  secure  we  need 
either  to  add  redundancy  to  the  plaintext  before  encryption  or  to  add  some  other  form  of  redundancy 
to  the  ciphertext,  so  that  we  can  check  upon  decryption  whether  the  ciphertext  has  been  validly 
generated.  In  addition  the  padding  used  needs  to  be  random  so  as  to  make  a  non-deterministic 
encryption  algorithm.  Over  the  years  a  number  of  padding  systems  have  been  proposed.  However, 
many  of  the  older  ones  are  now  considered  weak. 

By  far  the  most  successful  padding  scheme  in  use  today  was  invented  by  Bellare  and  Rogaway 
and is called OAEP or Optimal Asymmetric Encryption Padding. The general OAEP method is
a  padding  scheme  which  can  be  used  with  any  function  which  is  a  trapdoor  one-way  permutation 
on  strings  of  k  bits  in  length.  When  used  with  the  RSA  trapdoor  one-way  permutation  we  need  to 
“tweak”  the  construction  a  little  since  RSA  does  not  act  as  a  permutation  on  bit  strings  of  length  k, 
it  acts  as  a  permutation  on  the  set  of  integers  modulo  N.  When  used  with  RSA  it  is  often  denoted 
RSA-OAEP. 

Originally  it  was  thought  that  OAEP  was  a  plaintext  aware  encryption  algorithm  in  the  random 
oracle  model,  irrespective  of  the  underlying  trapdoor  one-way  permutation,  but  this  claim  has  since 
been  shown  to  be  wrong.  However,  one  can  show  in  the  random  oracle  model  that  RSA-OAEP  is 
semantically  secure  against  adaptive  chosen  ciphertext  attacks. 


We first give the description of OAEP in general. Let f be any k-bit to k-bit trapdoor one-way
permutation. Let $k_0$ and $k_1$ denote numbers such that a work effort of $2^{k_0}$ or $2^{k_1}$ is impossible (e.g.
$k_0, k_1 > 128$). Put $n = k - k_0 - k_1$ and let

$$G : \{0,1\}^{k_0} \longrightarrow \{0,1\}^{n+k_1},$$
$$H : \{0,1\}^{n+k_1} \longrightarrow \{0,1\}^{k_0}$$

be hash functions. Strictly speaking H is a hash function as it takes bitstrings and compresses
them in length, whereas G is a function more like a key derivation function in that it expands a
short bit string into a longer one. In practice for RSA-OAEP both H and G are implemented using
hash functions, with G being implemented by repeated hashing of the input along with a counter
as in Section 14.6.

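Let m be a message of n bits in length. We then encrypt using the function

$$c \leftarrow E(m) = f\Big(\big\{(m \,\|\, 0^{k_1}) \oplus G(R)\big\} \,\Big\|\, \big\{R \oplus H\big((m \,\|\, 0^{k_1}) \oplus G(R)\big)\big\}\Big) = f(A),$$

where

• $m \,\|\, 0^{k_1}$ means m followed by $k_1$ zero bits,

• R is a random bit string of length $k_0$,

• $\|$ denotes concatenation.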


One  can  view  OAEP  as  a  two- stage  Feistel  network,  as  Figure  16.3  demonstrates.  To  decrypt  we 
proceed  as  follows: 

• Apply the trapdoor to f to recover $A = f^{-1}(c) = \{T \,\|\, \{R \oplus H(T)\}\}$.

• Compute $H(T)$ and recover R from $R \oplus H(T)$.

• Compute $G(R)$ and recover $v = m \,\|\, 0^{k_1}$ from $T = (m \,\|\, 0^{k_1}) \oplus G(R)$.

• If v ends in $k_1$ zeros output m, otherwise return $\perp$.
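
The padding and unpadding steps (everything except the trapdoor permutation f) can be sketched at byte level as follows. This is an illustration under assumptions rather than a standards-compliant encoding: $k_0$ and $k_1$ are fixed at 128 bits, G is built by hashing the seed with a counter, H is a truncated SHA-256, and all function names are hypothetical.

```python
import hashlib
import secrets

K0 = 16          # bytes of randomness R (assumption: k0 = 128 bits)
K1 = 16          # bytes of zero padding (assumption: k1 = 128 bits)

def G(seed: bytes, out_len: int) -> bytes:
    """Expand seed to out_len bytes by hashing it together with a counter."""
    out = b""
    counter = 0
    while len(out) < out_len:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:out_len]

def H(data: bytes) -> bytes:
    """Compress data to K0 bytes."""
    return hashlib.sha256(data).digest()[:K0]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def oaep_pad(m: bytes) -> bytes:
    """Return A = T || (R xor H(T)) with T = (m || 0^k1) xor G(R)."""
    R = secrets.token_bytes(K0)
    T = xor(m + bytes(K1), G(R, len(m) + K1))
    return T + xor(R, H(T))

def oaep_unpad(A: bytes):
    """Invert oaep_pad; return None (standing for the symbol ⊥) on bad padding."""
    T, masked_R = A[:-K0], A[-K0:]
    R = xor(masked_R, H(T))
    v = xor(T, G(R, len(T)))
    if v[-K1:] != bytes(K1):
        return None
    return v[:-K1]
```

Changing any bit of the block changes the recovered R or the unmasked tail, so a modified block passes the zero check only with negligible probability; this is the redundancy that decryption verifies.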


When applying the OAEP transform to produce an RSA-based version we take $k = 8 \cdot \lfloor \log_{256}(N) \rfloor$, so that every k-bit string represents an integer less than N.
We then produce the OAEP block A as above, and then we think of A as an integer less than N.
It is to this integer we apply the RSA encryption function $f(A) = A^e \pmod{N}$. Upon decrypting
we invert the function, by computing $f^{-1}(c) = c^d \pmod{N}$ to obtain an integer $A'$. We then check
whether $A'$ has the correct number of zero bits to the left, and if so take A as the k rightmost bits
of $A'$. However, one needs to be careful about how one reports if the leftmost bits are not zero as
expected. Such an error should not be distinguishable from the OAEP padding decoding returning
$\perp$.

Figure 16.3. OAEP as a Feistel network

The main result about RSA-OAEP is the following.

Theorem  16.6.  In  the  random  oracle  model,  if  we  model  G  and  H  by  random  oracles  then  RSA- 
OAEP  is  IND-CCA  secure  if  the  RSA  problem  is  hard. 


Proof. We sketch the proof and leave the details for the interested reader to look up. We first
rewrite the RSA function f as

$$f : \{0,1\}^{n+k_1} \times \{0,1\}^{k_0} \longrightarrow (\mathbb{Z}/N\mathbb{Z})^*, \qquad (s, t) \longmapsto (s \,\|\, t)^e \pmod{N},$$

assuming the above-mentioned padding to the left is performed on encryption. We then define
RSA-OAEP as applying the above function f to the inputs

$$s = (m \,\|\, 0^{k_1}) \oplus G(r) \quad \text{and} \quad t = r \oplus H(s).$$

The RSA assumption can be proved to be equivalent to the partial one-wayness of the function
f, in the sense that the problem of recovering s from $f(s,t)$ is as hard as recovering $(s,t)$ from
$f(s,t)$. So for the rest of our sketch we try to turn an adversary A for breaking RSA-OAEP into
an algorithm B which solves the partial one-wayness of the RSA function. In particular B is given
$c^* = f(s^*, t^*)$, for some fixed RSA modulus N, and is asked to compute $s^*$.

Algorithm  A  works  in  the  random  oracle  model  and  so  it  is  assumed  to  only  access  the  functions 
H  and  G  via  external  calls,  with  the  answers  being  provided  by  the  environment.  In  addition  A 
expects  H  and  G  to  “act  like”  random  functions.  In  our  context  B  is  the  environment  for  A  and 
so  needs  to  supply  A  with  the  answers  to  its  calls  to  the  functions  H  and  G.  Thus  B  maintains  a 
list  of  queries  to  H  and  G  made  by  algorithm  A,  along  with  the  responses.  We  call  these  lists  the 
H-List and the G-List respectively.

Algorithm B now calls algorithm A, which will make a series of calls to the H and G oracles
(we discuss how these are answered below). Eventually A will make a call to its $\mathcal{O}_{LR}$ oracle by
producing two messages $m_0$ and $m_1$ of n bits in length. A bit b is then chosen by B, and B now
assumes that $c^*$ is the encryption of $m_b$. The ciphertext $c^*$ is now returned to A, as the response
to its call of the $\mathcal{O}_{LR}$ oracle. Algorithm A then continues and tries to guess the bit b.

The  oracle  queries  are  answered  by  B  as  follows: 

• Query $G(\gamma)$:

For any query $(\delta, \delta_H)$ in the H-List one checks whether

$$c^* = f(\delta, \gamma \oplus \delta_H).$$

– If this holds then we have partially inverted f as required (i.e. B can output $\delta$ as its
solution). We can still, however, continue with the simulation of G and set

$$G(\gamma) = \delta \oplus (m_b \,\|\, 0^{k_1}).$$

– If this equality does not hold for any value of $\delta$ then we choose $\gamma_G = G(\gamma)$ uniformly
at random from the codomain of G, and add the pair $(\gamma, \gamma_G)$ to the G-List.

• Query $H(\delta)$:

A random value $\delta_H$ is chosen from the codomain of H; the value $(\delta, \delta_H)$ is added to the
H-List. We also check whether for any $(\gamma, \gamma_G)$ in the G-List we have

$$c^* = f(\delta, \gamma \oplus \delta_H);$$

if so we have managed to partially invert the function f as required, and we output the
value of $\delta$.

• Query decryption of c:

We look in the G-List and the H-List for a pair $(\gamma, \gamma_G)$, $(\delta, \delta_H)$ such that if we set $\sigma = \delta$,
$\tau = \gamma \oplus \delta_H$ and $\mu = \gamma_G \oplus \delta$, then $c = f(\sigma, \tau)$ and the $k_1$ least significant bits of $\mu$ are equal
to zero. If this is the case then we return the plaintext consisting of the n most significant
bits of $\mu$, otherwise we return $\perp$.

Notice  that  if  a  ciphertext  which  was  generated  in  the  correct  way  (by  calling  G,  H  and  the 
encryption  algorithm)  is  then  passed  to  the  above  decryption  oracle,  we  will  obtain  the  original 
plaintext  back. 

We  have  to  show  that  the  above  decryption  oracle  is  able  to  “fool”  the  adversary  A  enough  of 
the time. In other words when the oracle is passed a ciphertext which has not been generated by a
prior call to the necessary G and H, we need to show that it produces a value which is consistent
with the running of the adversary A. Finally we need to show that if the adversary A has a
non-negligible chance of breaking the semantic security of RSA-OAEP then one has a non-negligible
probability that B can partially invert f.

These  last  two  facts  are  proved  by  careful  analysis  of  the  probabilities  associated  with  a  number 
of events. Recall that B assumes that $c^* = f(s^*, t^*)$ is an encryption of $m_b$. Hence, there should
exist an $r^*$ which satisfies

$$r^* = H(s^*) \oplus t^*, \qquad G(r^*) = s^* \oplus (m_b \,\|\, 0^{k_1}).$$

One first shows that the probability of the decryption simulator failing is negligible. Then one
shows that the probability that $s^*$ is actually asked of the H oracle is non-negligible, as long as the
adversary A has a non-negligible probability of finding the bit b. But as soon as $s^*$ is asked of H
then we spot this and can therefore break the partial one-wayness of f.

The  actual  technical  probability  arguments  are  rather  involved  and  we  refer  the  reader  to  the 
paper  of  Fujisaki,  Okamoto,  Pointcheval  and  Stern  where  the  full  proof  is  given.  □ 



16.2.2. The Fujisaki-Okamoto Transform: We end this section with another generic transform
which can be applied to one of the IND-CPA secure schemes from Section 16.1, to turn it into
an IND-CCA secure encryption scheme. Like the OAEP transform, the transform in this section is
secure assuming the adversary operates in the random oracle model.

Suppose  we  have  a  public  key  encryption  scheme  which  is  semantically  secure  against  chosen 
plaintext attacks, such as ElGamal encryption. Such a scheme by definition needs to be
non-deterministic, hence we write the encryption function as

$$E(m, r),$$

where m is the message to be encrypted and r is the random input, and we denote the decryption
function by $D(c)$. Hence, for ElGamal encryption we have

$$E(m, r) = (g^r, m \cdot h^r).$$


Fujisaki  and  Okamoto  showed  how  to  turn  such  a  scheme  into  one  which  is  IND-CCA  secure.  Their 
result  only  applies  in  the  random  oracle  model  and  works  by  showing  that  the  resulting  scheme  is 
plaintext  aware.  We  do  not  go  into  the  details  of  the  proof  at  all,  but  simply  give  the  transformation, 
which  is  both  simple  and  elegant. 

We  take  the  encryption  function  above  and  alter  it  by  setting 


$$E'(m, r) = E(m \,\|\, r, H(m \,\|\, r)),$$

where H is a hash function. The decryption algorithm is also altered in that we first compute

$$m' = D(c)$$

and then we check that

$$c = E(m', H(m')).$$

If this last equation holds we recover m from $m' = m \,\|\, r$; if the equation does not hold then we
return $\perp$. For ElGamal encryption we therefore obtain the encryption algorithm

$$\left(g^{H(m \| r)}, \; (m \,\|\, r) \cdot h^{H(m \| r)}\right),$$

which is only marginally less efficient than raw ElGamal encryption.
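
The transform applied to ElGamal can be sketched as below. Everything concrete here is an assumption made for illustration: the modulus $2^{127} - 1$ and base 3 are toy parameters (the base is not checked to generate a prime-order subgroup), H is SHA-256 reduced modulo $p - 1$, and $m \,\|\, r$ is encoded as an integer with a leading 0x01 byte so that it is a non-zero element of fixed length.

```python
import hashlib
import secrets

P = 2**127 - 1           # toy prime modulus (assumption, not a secure size)
G = 3                    # hypothetical base
M_LEN, R_LEN = 8, 6      # byte lengths of the message m and the randomness r
ENC_LEN = 1 + M_LEN + R_LEN

def H(data: bytes) -> int:
    """Hash m||r to an exponent."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (P - 1)

def keygen():
    x = secrets.randbelow(P - 2) + 1
    return pow(G, x, P), x

def elgamal_enc(h, m_int, k):
    """Raw ElGamal E(m, k) with explicit randomness k."""
    return pow(G, k, P), (m_int * pow(h, k, P)) % P

def elgamal_dec(x, c):
    c1, c2 = c
    return (c2 * pow(pow(c1, x, P), -1, P)) % P

def fo_encrypt(h, m: bytes):
    """E'(m, r) = E(m||r, H(m||r))."""
    assert len(m) == M_LEN
    mr = b"\x01" + m + secrets.token_bytes(R_LEN)
    return elgamal_enc(h, int.from_bytes(mr, "big"), H(mr))

def fo_decrypt(h, x, c):
    """Decrypt, then re-encrypt and compare; None stands for the symbol ⊥."""
    m_int = elgamal_dec(x, c)
    if m_int >= 2 ** (8 * ENC_LEN):
        return None
    mr = m_int.to_bytes(ENC_LEN, "big")
    if mr[0] != 1 or elgamal_enc(h, m_int, H(mr)) != c:
        return None
    return mr[1:1 + M_LEN]
```

The re-encryption check is what defeats the malleability attack of Lemma 16.5: doubling $c_2$ changes $m'$, and then $g^{H(m')}$ no longer matches $c_1$.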


16.3.  Hybrid  Ciphers 

Almost always public key schemes are used only to transmit a short per-message secret, such as a
session  key.  This  is  because  public  key  schemes  are  too  inefficient  to  use  to  encrypt  vast  amounts 
of  data.  The  actual  data  is  then  encrypted  using  a  symmetric  cipher.  Such  an  approach  is  called 
a  hybrid  encryption  scheme. 

We  now  formalize  this  way  of  designing  a  public  key  encryption  scheme  with  a  hybrid  cipher, 
via  the  so-called  KEM/DEM  approach.  A  KEM  is  a  Key  Encapsulation  Mechanism,  which  is  the 
public  key  component  of  a  hybrid  cipher,  whilst  a  DEM  is  a  Data  Encapsulation  Mechanism,  which 
is  the  symmetric  component.  We  have  already  mentioned  DEMs  in  Chapters  11  and  13,  where  we 
constructed  DEMs  which  were  ot-IND-CCA  secure  as  symmetric  key  encryption  schemes,  namely 
symmetric  key  schemes  which  were  only  ever  designed  to  encrypt  a  single  message. 

We  will  present  a  security  model  for  KEMs  and  then  show,  without  using  random  oracles,  that 
a  suitably  secure  DEM  and  a  suitably  secure  KEM  can  be  combined  to  produce  an  IND-CCA  secure 
hybrid  cipher.  This  means  we  only  need  to  consider  the  symmetric  and  public  key  parts  separately, 
simplifying  our  design  considerably.  Finally,  we  show  how  a  KEM  can  be  constructed  in  the  random 
oracle  model  using  either  the  RSA  or  the  DLP  primitive.  The  resulting  KEMs  are  very  simple  to 
construct  and  very  natural,  so  we  see  the  simplification  obtained  by  utilizing  hybrid  encryption. 
16.3.1. Defining a Key Encapsulation Mechanism: We define a Key Encapsulation Mechanism,
or KEM, to be a mechanism which from the encryptor’s side takes a public key $\mathfrak{pk}$ and
outputs a symmetric key $k \in \mathbb{K}$ and an encapsulation of that key c for use by the holder of the
corresponding private key. The holder of the private key $\mathfrak{sk}$ can then take the encapsulation c
and their private key, and then recover the symmetric key k. Thus no message is input into the
encapsulation mechanism. We therefore have three algorithms which operate as follows:

• $(\mathfrak{pk}, \mathfrak{sk}) \leftarrow \mathrm{KeyGen}(\mathbb{K})$.

• $(c, k) \leftarrow \mathrm{Encap}_{\mathfrak{pk}}()$.

• $k \leftarrow \mathrm{Decap}_{\mathfrak{sk}}(c)$.

For correctness we require, for all pairs $(\mathfrak{pk}, \mathfrak{sk})$ output by $\mathrm{KeyGen}(\mathbb{K})$, that

$$\text{If } (c, k) \leftarrow \mathrm{Encap}_{\mathfrak{pk}}() \text{ then } \mathrm{Decap}_{\mathfrak{sk}}(c) = k.$$


The security definition for KEMs is based on the security definition of indistinguishability of
encryptions for public key encryption algorithms. However, we now require that the key output by
a KEM should be indistinguishable from a random key. Thus security is defined via the
following game.

• The challenger generates a random key $k_0 \in \mathbb{K}$ from the space of symmetric keys output
by the KEM.

• The challenger calls the Encap function of the KEM to produce a valid key $k_1 \in \mathbb{K}$ and its
encapsulation $c^*$, under the public key $\mathfrak{pk}$.

• The challenger picks a bit b and sends to the adversary the values $k_b$, $c^*$.

• The goal of the adversary is to decide whether b = 0 or 1.

The advantage of the adversary, against the KEM $\Pi$, is defined to be

$$\mathrm{Adv}^{\mathrm{IND\text{-}CPA}}_{\Pi}(A) = 2 \cdot \left| \Pr\left[A(\mathfrak{pk}, k_b, c^*) = b\right] - \frac{1}{2} \right|.$$

The above only defines the security in the passive case; to define security under adaptive chosen
ciphertext attacks one needs to give the adversary access to a decapsulation function. This
decapsulation function will return the key (or the invalid encapsulation symbol) for any encapsulation of
the adversary’s choosing, bar the target encapsulation $c^*$. In such a situation we denote the
advantage by $\mathrm{Adv}^{\mathrm{IND\text{-}CCA}}_{\Pi}(A)$, and say the KEM is secure if this advantage is “small” for all adversaries
A. We describe the full security model in Figure 16.4.


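Challenger:
    $(\mathfrak{pk}, \mathfrak{sk}) \leftarrow \mathrm{KeyGen}(\mathbb{K})$
    $b \leftarrow \{0, 1\}$, $k_1 \leftarrow \mathbb{K}$
    $(c^*, k_0) \leftarrow \mathrm{Encap}_{\mathfrak{pk}}()$
    Send $\mathfrak{pk}, k_b, c^*$ to A
    Receive $b'$ from A
    Win if $b' = b$

Oracle $O_{\mathrm{Decap}_{\mathfrak{sk}}}$, on a query $c$ from A:
    If $c = c^*$ then abort.
    $k \leftarrow \mathrm{Decap}_{\mathfrak{sk}}(c)$
    Return $k$ to A


Figure 16.4. Security game IND-CCA for a KEM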


16.3.2. Generically Constructing Hybrid Encryption: The idea of a KEM/DEM system
is that one takes a KEM (defined by the algorithms KeyGen, $\mathrm{Encap}_{\mathfrak{pk}}$, $\mathrm{Decap}_{\mathfrak{sk}}$ and which outputs
symmetric keys from the space $\mathbb{K}$) and a DEM (defined by the algorithms $e_k$, $d_k$ and with key
space $\mathbb{K}$), and then uses the two together to form a hybrid cipher, a.k.a. a public key encryption
scheme. The key generation method of the public key scheme is simply the key generation method
of the underlying KEM. To encrypt a message $m$ to a user with public/private key pair $(\mathfrak{pk}, \mathfrak{sk})$,
one performs the following steps:

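• $(k, c_1) \leftarrow \mathrm{Encap}_{\mathfrak{pk}}()$.

• $c_2 \leftarrow e_k(m)$.

• $c \leftarrow (c_1, c_2)$.

The recipient, upon receiving the pair $c = (c_1, c_2)$, performs the following steps to recover $m$.

• $k \leftarrow \mathrm{Decap}_{\mathfrak{sk}}(c_1)$.

• If $k = \perp$ return $\perp$.

• $m \leftarrow d_k(c_2)$.

• Return $m$.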
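The composition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy Diffie-Hellman KEM (modulus `P`, base `G`), the SHA-256 keystream and the HMAC-based DEM are stand-ins chosen for brevity, all function names are hypothetical, and the parameters are far too small for real use.

```python
import hashlib
import hmac
import secrets

# Toy KEM (assumption): Diffie-Hellman in F_P^* with an insecurely small prime.
P = 2**127 - 1
G = 3

def kem_keygen():
    sk = secrets.randbelow(P - 2) + 1
    return pow(G, sk, P), sk                      # (pk, sk)

def kem_encap(pk):
    u = secrets.randbelow(P - 2) + 1
    c1 = pow(G, u, P)
    k = hashlib.sha256(f"{pow(pk, u, P)}|{c1}".encode()).digest()
    return k, c1                                  # (k, c1)

def kem_decap(sk, c1):
    return hashlib.sha256(f"{pow(c1, sk, P)}|{c1}".encode()).digest()

# Toy one-time DEM (assumption): Encrypt-then-MAC with a hash-based keystream.
def _subkey(k, label):
    return hashlib.sha256(label + k).digest()

def _stream(key, n):
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def dem_encrypt(k, m):
    ke, km = _subkey(k, b"enc"), _subkey(k, b"mac")
    body = bytes(a ^ b for a, b in zip(m, _stream(ke, len(m))))
    return body + hmac.new(km, body, hashlib.sha256).digest()

def dem_decrypt(k, c2):
    ke, km = _subkey(k, b"enc"), _subkey(k, b"mac")
    body, tag = c2[:-32], c2[-32:]
    if not hmac.compare_digest(tag, hmac.new(km, body, hashlib.sha256).digest()):
        return None                               # None plays the role of the symbol ⊥
    return bytes(a ^ b for a, b in zip(body, _stream(ke, len(body))))

# The hybrid cipher: exactly the two bullet lists above.
def hybrid_encrypt(pk, m):
    k, c1 = kem_encap(pk)
    c2 = dem_encrypt(k, m)
    return c1, c2

def hybrid_decrypt(sk, c):
    c1, c2 = c
    k = kem_decap(sk, c1)
    if k is None:                                 # this toy KEM never rejects
        return None
    return dem_decrypt(k, c2)
```

Each call to `hybrid_encrypt` obtains a fresh key from the KEM, so the DEM only ever sees one message per key.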

We  would  like  the  above  hybrid  cipher  to  meet  our  security  definition  for  public  key  encryption 
schemes,  namely  IND-CCA. 

Theorem 16.7. The hybrid public key encryption scheme $\Pi$ defined above is IND-CCA secure,
assuming the KEM scheme $\Pi_1$ is IND-CCA secure and the DEM $\Pi_2$ is ot-IND-CCA secure. In
particular if A is an adversary against the IND-CCA security of the hybrid public key encryption
scheme then there exist adversaries B and C such that

\[ \mathrm{Adv}^{\mathrm{IND\text{-}CCA}}_{\Pi}(A) \le 2 \cdot \mathrm{Adv}^{\mathrm{IND\text{-}CCA}}_{\Pi_1}(B) + \mathrm{Adv}^{\mathrm{ot\text{-}IND\text{-}CCA}}_{\Pi_2}(C). \]

Before  we  give  the  proof  notice  that  we  only  need  one-time  security  for  the  DEM,  as  each  symmetric 
encryption  key  output  by  the  KEM  is  only  used  once.  Thus  we  can  construct  the  DEM  from  much 
simpler  components. 


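Challenger:
    $(\mathfrak{pk}, \mathfrak{sk}) \leftarrow \mathrm{KeyGen}(\mathbb{K})$
    $b \leftarrow \{0, 1\}$
    Send $\mathfrak{pk}$ to A
    Receive $b'$ from A
    Win if $b' = b$

Oracle $O_{LR}$, on a query $m_0, m_1 \in \mathbb{P}$ from A:
    $(k_0, c_1^*) \leftarrow \mathrm{Encap}_{\mathfrak{pk}}()$
    $c_2^* \leftarrow e_{k_0}(m_b)$
    $c^* \leftarrow (c_1^*, c_2^*)$
    Return $c^*$ to A

Oracle $O_{d_{\mathfrak{sk}}}$, on a query $c = (c_1, c_2) \in \mathbb{C}$ from A:
    If $c = c^*$ then abort.
    $k \leftarrow \mathrm{Decap}_{\mathfrak{sk}}(c_1)$
    If $k = \perp$ then $m \leftarrow \perp$
    else $m \leftarrow d_k(c_2)$
    Return $m$ to A


Figure 16.5. IND-CCA game $G_0$ for our hybrid scheme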


Proof. We sketch the proof and direct the reader to the paper of Cramer and Shoup for more
details. First consider Figure 16.5; this is the standard IND-CCA game for public key encryption,
tailored for our hybrid encryption scheme. Let us call this game $G_0$. We now modify the game
which A is playing; instead of computing $c_2^*$ using the valid key $k_0$ we instead use a new random
key called $k_1$. This modified game we call $G_1$ and we present it in Figure 16.6.

Game $G_1$ is relatively easy to analyse so we will do this first. We will show that if A wins in
game $G_1$, then we can construct an adversary C which will use A as a subroutine to break the
ot-IND-CCA security of the DEM $(e_k, d_k)$. The key trick we use is that the $c_1^*$ component in game
$G_1$ is unrelated to the key which is used to encrypt $m_b$. We present algorithm C in Algorithm
16.3. Notice that in game $G_1$ when A makes a decryption query for $(c_1, c_2)$, it is validly decrypted
by C itself, as long as $c_1 \ne c_1^*$. When instead $c_1 = c_1^*$, algorithm C uses its own decryption
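Challenger:
    $(\mathfrak{pk}, \mathfrak{sk}) \leftarrow \mathrm{KeyGen}(\mathbb{K})$
    $b \leftarrow \{0, 1\}$, $k_1 \leftarrow \mathbb{K}$
    Send $\mathfrak{pk}$ to A
    Receive $b'$ from A
    Win if $b' = b$

Oracle $O_{LR}$, on a query $m_0, m_1 \in \mathbb{P}$ from A:
    $(k_0, c_1^*) \leftarrow \mathrm{Encap}_{\mathfrak{pk}}()$
    $c_2^* \leftarrow e_{k_1}(m_b)$
    $c^* \leftarrow (c_1^*, c_2^*)$
    Return $c^*$ to A

Oracle $O_{d_{\mathfrak{sk}}}$, on a query $c = (c_1, c_2) \in \mathbb{C}$ from A:
    If $c = c^*$ then abort.
    If $c_1 = c_1^*$ then $k \leftarrow k_1$
    else $k \leftarrow \mathrm{Decap}_{\mathfrak{sk}}(c_1)$
    If $k = \perp$ then $m \leftarrow \perp$
    else $m \leftarrow d_k(c_2)$
    Return $m$ to A


Figure 16.6. Game $G_1$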


oracle to return the decryption of $c_2$. Note that C's target ciphertext $c_2^*$ is never passed to its own
decryption oracle: that would require A to make the invalid query $c = c^*$, on which C aborts. In addition, note that the
$O_{LR}$ oracle of C is only called once, as is required in the one-time security of a DEM. Finally, note
that A winning (or losing) game $G_1$ directly corresponds to C winning (or losing) its own game, thus


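\[ \mathrm{Adv}^{\mathrm{ot\text{-}IND\text{-}CCA}}_{\Pi_2}(C) = 2 \cdot \left| \Pr[C \text{ wins}] - \frac{1}{2} \right| = 2 \cdot \left| \Pr[A \text{ wins in game } G_1] - \frac{1}{2} \right|. \]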


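Algorithm 16.3: Algorithm C

(Here ciphertexts are indexed as $c = (c_0, c_1)$, with $c_0$ the KEM part and $c_1$ the DEM part.)

$(\mathfrak{pk}, \mathfrak{sk}) \leftarrow \mathrm{KeyGen}(\mathbb{K})$.
Call A with input the public key $\mathfrak{pk}$.

/* A's $O_{LR}$ Oracle Queries */
A makes an $O_{LR}$ query with messages $m_0, m_1$.
C passes $m_0, m_1$ to its own $O_{LR}$ oracle to obtain $c_1^*$.
$c_0^* \leftarrow \mathrm{Encap}_{\mathfrak{pk}}()$, keeping only the encapsulation.
$c^* \leftarrow (c_0^*, c_1^*)$.
return $c^*$.

/* A's $O_{d_k}$ Oracle Queries */
A makes an $O_{d_k}$ query with ciphertext $c = (c_0, c_1)$.
if $c_0 \ne c_0^*$ then $k \leftarrow \mathrm{Decap}_{\mathfrak{sk}}(c_0)$, $m \leftarrow d_k(c_1)$.
else if $c_1 \ne c_1^*$ then C passes $c_1$ to its $O_{d_k}$ oracle to obtain $m$.
else abort.
return $m$.

/* A's Response */
When A returns $b'$.
return $b'$.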


We now turn to what is the most complex step. We want to bound the quantity

\[ \left| \Pr[A \text{ wins in game } G_0] - \Pr[A \text{ wins in game } G_1] \right|. \]

We do this by presenting an algorithm B which uses A to break the IND-CCA security of the KEM.
It does this as follows: Algorithm B does not know whether the key/encapsulation pair given to
it is a real encapsulation of the key, or a fake one. It constructs an environment for A to play in,
where in the first case A is playing game $G_0$, whereas in the second it is playing game $G_1$. Hence,
any advantage A has in distinguishing the two games it is playing in can be exploited by B to
break the KEM. We give the algorithm for B in Algorithm 16.4.


Algorithm 16.4: Algorithm B

B has as input $\mathfrak{pk}$.
Call A with input the public key $\mathfrak{pk}$.

/* A's $O_{LR}$ Oracle Queries */
A makes an $O_{LR}$ query with messages $(m_0, m_1)$.
B calls its own $O_{LR}$ oracle to obtain $(c_0^*, k^*)$.
$b \leftarrow \{0, 1\}$.
$c_1^* \leftarrow e_{k^*}(m_b)$.
$c^* \leftarrow (c_0^*, c_1^*)$.
return $c^*$.

/* A's $O_{d_k}$ Oracle Queries */
A makes an $O_{d_k}$ query with ciphertext $c = (c_0, c_1)$.
if $c_0 \ne c_0^*$ then
    B calls its $O_{\mathrm{Decap}_{\mathfrak{sk}}}$ oracle on $c_0$ to obtain $k$.
    $m \leftarrow d_k(c_1)$.
else if $c_1 \ne c_1^*$ then $m \leftarrow d_{k^*}(c_1)$.
else abort.
return $m$.

/* A's Response */
When A returns $b'$.
$a \leftarrow 0$.
if $b' \ne b$ then $a \leftarrow 1$.
return $a$.


So algorithm B outputs zero if it thinks A is playing in game $G_0$, i.e. if $k^*$ is the actual key
underlying the encapsulation $c_0^*$, whereas B will output one if it thinks A is playing in game $G_1$,
i.e. if $k^*$ is just some random key. So we have

\[ 2 \cdot \mathrm{Adv}^{\mathrm{IND\text{-}CCA}}_{\Pi_1}(B) = 2 \cdot \left| \Pr[a = 0 \mid A \text{ in game } G_0] - \Pr[a = 0 \mid A \text{ in game } G_1] \right| \]
\[ = 2 \cdot \left| \Pr[A \text{ wins in game } G_0] - \Pr[A \text{ wins in game } G_1] \right|. \]


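We then have that

\begin{align*}
\mathrm{Adv}^{\mathrm{IND\text{-}CCA}}_{\Pi}(A)
 &= 2 \cdot \left| \Pr[A \text{ wins in game } G_0] - \frac{1}{2} \right| \\
 &= 2 \cdot \left| \Pr[A \text{ wins in game } G_0] - \Pr[A \text{ wins in game } G_1] + \Pr[A \text{ wins in game } G_1] - \frac{1}{2} \right| && \text{(adding zero)} \\
 &\le 2 \cdot \left| \Pr[A \text{ wins in game } G_0] - \Pr[A \text{ wins in game } G_1] \right| + 2 \cdot \left| \Pr[A \text{ wins in game } G_1] - \frac{1}{2} \right| && \text{(triangle inequality)} \\
 &= 2 \cdot \left| \Pr[A \text{ wins in game } G_0] - \Pr[A \text{ wins in game } G_1] \right| + \mathrm{Adv}^{\mathrm{ot\text{-}IND\text{-}CCA}}_{\Pi_2}(C) \\
 &= 2 \cdot \mathrm{Adv}^{\mathrm{IND\text{-}CCA}}_{\Pi_1}(B) + \mathrm{Adv}^{\mathrm{ot\text{-}IND\text{-}CCA}}_{\Pi_2}(C).
\end{align*}

□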


16.4.  Constructing  KEMs 

As  previously  mentioned,  KEMs  are  simpler  to  design  than  public  key  encryption  algorithms.  In 
this  section  we  first  look  at  RSA-KEM,  whose  construction  and  proof  should  be  compared  to  that 
of  RSA-OAEP.  Then  we  turn  to  DHIES-KEM  which  should  be  compared  to  the  ElGamal  variant 
of  the  scheme  based  on  the  Fujisaki-Okamoto  transform. 

16.4.1. RSA-KEM: Let $N$ denote an RSA modulus, i.e. a product of two primes $p$ and $q$ of
roughly the same size. Let $e$ denote an RSA public exponent and $d$ an RSA private exponent.
We let $f_{N,e}$ denote the RSA function, i.e. the function that maps an integer $x$ modulo $N$ to
the number $x^e \pmod{N}$. The important point for RSA-KEM is that this is a trapdoor one-way
function; only the holder of the secret trapdoor $d$ should be able to invert it. This is summarized
in the RSA problem, which is the problem of given an integer $y$ modulo $N$ to recover the value of
$x$ such that $f_{N,e}(x) = y$.

We  define  RSA-KEM  by  taking  a  cryptographic  key  derivation  function  H  which  takes  integers 
modulo  N  and  maps  them  to  symmetric  keys  of  the  size  required  by  the  user  of  the  KEM  (i.e.  the 
key  size  of  the  DEM).  In  our  security  model  we  will  assume  that  H  behaves  like  a  random  oracle. 
Encapsulation  then  works  as  follows: 

• $x \leftarrow \{1, \ldots, N - 1\}$.

• $c \leftarrow f_{N,e}(x)$.

• $k \leftarrow H(x)$.

• Output $(k, c)$.

Since the person with the private key can invert the function $f_{N,e}$, decapsulation is easily performed
via

• $x \leftarrow f_{N,e}^{-1}(c)$.

• $k \leftarrow H(x)$.

• Output $k$.
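As a rough illustration of the two procedures above, here is a minimal Python sketch. The modulus is built from two Mersenne primes purely so that the example is self-contained, and SHA-256 stands in for the key derivation function $H$; both choices are assumptions made for this sketch and the sizes are not secure.

```python
import hashlib
import secrets

# Toy RSA key (assumption): product of the Mersenne primes 2^61 - 1 and 2^89 - 1.
p, q = 2**61 - 1, 2**89 - 1
N = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (Python 3.8+)

def H(x: int) -> bytes:
    """Key derivation function, here simply SHA-256 of x (an assumption)."""
    return hashlib.sha256(x.to_bytes((N.bit_length() + 7) // 8, "big")).digest()

def encap():
    x = secrets.randbelow(N - 1) + 1   # x <- {1, ..., N-1}
    c = pow(x, e, N)                   # c <- f_{N,e}(x)
    return H(x), c                     # output (k, c)

def decap(c: int) -> bytes:
    x = pow(c, d, N)                   # x <- f_{N,e}^{-1}(c)
    return H(x)                        # output k
```

Note that `decap` never rejects its input: every integer modulo $N$ decapsulates to some key.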

There  is  no  notion  of  invalid  ciphertexts,  and  this  is  simpler  in  comparison  to  RSA-OAEP.  The 
construction  actually  works  for  any  trapdoor  one-way  function.  We  now  only  need  to  show  that 
this  simple  construction  meets  our  definition  of  a  secure  KEM. 

Theorem 16.8. In the random oracle model RSA-KEM is an IND-CCA secure KEM, assuming
the RSA problem is hard. In particular given an adversary A against the IND-CCA property of the
RSA-KEM scheme $\Pi$, for moduli of size $v$ bits, which treats $H$ as a random oracle, then there is
an adversary B against the RSA problem for integers of size $v$ such that

\[ \mathrm{Adv}^{\mathrm{IND\text{-}CCA}}_{\Pi}(A) \le \mathrm{Adv}^{\mathrm{RSA}}_{v}(B). \]

Proof. Since A works in the random oracle model, we model the function $H$ in the proof as a
random oracle. Thus algorithm B keeps a list of triples $(z, c, h)$, which we will call the H-List, of
queries to $H$, which is initially set to be empty. The value $z$ denotes the query to the function $H$,




the value $h$ the output and the value $c$ denotes the output of $f_{N,e}$ on $z$. Algorithm B has as input
a value $y$ for which it is trying to invert the function $f_{N,e}$. Algorithm B passes the values $N, e$ to A
as the public key of the KEM which A is trying to attack. To generate the challenge encapsulation,
algorithm B generates a symmetric key $k$ at random and takes as the challenge encapsulation the
value $c^* \leftarrow y$ of the RSA function for which B is trying to find the preimage. It then passes $k$ and
$c^*$ to A.

The adversary is allowed to make queries of $H$ for values $z$. If this query on $z$ has been made
before, then B uses its H-List to respond as required. If there is a value on the list of the form
$(\perp, c, h)$ with $f_{N,e}(z) = c$ then B replaces this value with $(z, c, h)$ and responds with $h$. Otherwise
B generates a new random value of $h$, adds the triple $(z, f_{N,e}(z), h)$ to the list and responds with
$h$.

The adversary can also make decapsulation queries on an encapsulation $c$. If there is a value
$(z, c, h)$, for some $z$ and $h$, on the H-List it responds with $h$. Otherwise, it generates $h$ at random,
places the triple $(\perp, c, h)$ on the list and responds with $h$.

Since A is running in the random oracle model, the only way that A can have any success in
the game is by querying $H$ on the preimage of $y$. Thus if A is successful then the preimage of $y$ will
exist on the list of triples kept by algorithm B. Hence, when A terminates B searches its H-List
for a triple of the form $(x, y, h)$ and if there is one it outputs $x$ as the preimage of $y$.

In summary algorithm B is presented in Algorithm 16.5. It is easy to see that the calls to $H$
and the calls to $O_{\mathrm{Decap}_{\mathfrak{sk}}}$ are answered by B in a consistent way, due to algorithm B's ability to
ensure the required behaviour of the random oracle responses. □
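The bookkeeping that B performs on its H-List can be made concrete with a small Python sketch. The class name `RsaKemReduction`, the 16-byte keys and the tiny modulus $3233 = 61 \cdot 53$ are all assumptions chosen for illustration; the point is only to show how decapsulation queries are answered without $d$ and later patched to agree with $H$.

```python
import secrets

class RsaKemReduction:
    """Sketch of algorithm B: simulates H and Decap for A without knowing d."""

    def __init__(self, N, e, y):
        self.N, self.e, self.y = N, e, y
        self.h_list = []                      # entries [z, c, h]; z is None for ⊥

    def _fresh_key(self):
        return secrets.token_bytes(16)

    def challenge(self):
        # pk = (N, e), a random key k, and the challenge encapsulation c* = y
        return (self.N, self.e), self._fresh_key(), self.y

    def H(self, z):
        c = pow(z, self.e, self.N)
        for entry in self.h_list:
            if entry[0] == z:                 # query made before
                return entry[2]
        for entry in self.h_list:
            if entry[0] is None and entry[1] == c:
                entry[0] = z                  # replace (⊥, c, h) by (z, c, h)
                return entry[2]
        h = self._fresh_key()
        self.h_list.append([z, c, h])
        return h

    def decap(self, c):
        for entry in self.h_list:
            if entry[1] == c:
                return entry[2]
        h = self._fresh_key()
        self.h_list.append([None, c, h])      # remember the answer for later H queries
        return h

    def finish(self):
        for z, c, _ in self.h_list:
            if z is not None and c == self.y:
                return z                      # preimage of y
        return None                           # plays the role of ⊥

# Toy run: a "successful" adversary is modelled by querying H on the preimage.
N, e = 3233, 17
x_secret = 65
y = pow(x_secret, e, N)
B = RsaKemReduction(N, e, y)
pk, k, c_star = B.challenge()
k_dec = B.decap(pow(123, e, N))               # answered without the private key
assert B.H(123) == k_dec                      # the later H query is consistent
B.H(x_secret)
assert B.finish() == x_secret
```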


16.4.2. The DHIES Encryption Scheme: The DHIES encryption scheme is the instantiation
of our hybrid encryption paradigm, with the DHIES-KEM and the data encapsulation mechanism
being the Encrypt-then-MAC instantiation. The scheme was designed by Abdalla, Bellare and
Rogaway and was originally called DHAES, for Diffie-Hellman Augmented Encryption Scheme.
However, this caused confusion with the Advanced Encryption Standard, so the name was changed
to DHIES, for Diffie-Hellman Integrated Encryption Scheme. When used with elliptic curves it is
called ECIES. To define the scheme all we need to do is present the DHIES-KEM component, as
the rest follows from our prior discussions.

Key Generation: The domain parameters are a cyclic finite abelian group $G$ of prime order $q$,
a generator $g$ and the key space $\mathbb{K}$ for the data encapsulation mechanism to be used. We require
a key derivation function $H$ with codomain equal to $\mathbb{K}$, which again we will model as a random
oracle. To generate a public/private key pair we generate a random $x \leftarrow \mathbb{Z}/q\mathbb{Z}$ and compute the
public key $h \leftarrow g^x$.


Encapsulation:  Encapsulation  proceeds  as  follows: 

• $u \leftarrow \mathbb{Z}/q\mathbb{Z}$.

• $v \leftarrow h^u$.

• $c \leftarrow g^u$.

• $k \leftarrow H(v \| c)$.


Decapsulation: To decapsulate the KEM one takes $c$ and using the private key one computes

• $v \leftarrow c^x$.

• $k \leftarrow H(v \| c)$.
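The key generation, encapsulation and decapsulation steps can be sketched as follows. The group is the order-509 subgroup of $\mathbb{F}_{1019}^*$ generated by 4, and $H$ is approximated by SHA-256 over a textual encoding of $v \| c$; these parameters and the encoding are assumptions made for this toy example and offer no security.

```python
import hashlib
import secrets

# Toy domain parameters (assumption): 1019 = 2 * 509 + 1, and g = 4 has order 509.
p, q, g = 1019, 509, 4
assert pow(g, q, p) == 1 and g != 1

def H(v: int, c: int) -> bytes:
    """Key derivation on v || c, here SHA-256 of a simple encoding (an assumption)."""
    return hashlib.sha256(f"{v}|{c}".encode()).digest()

def keygen():
    x = secrets.randbelow(q)           # x <- Z/qZ
    return pow(g, x, p), x             # public key h = g^x, private key x

def encap(h: int):
    u = secrets.randbelow(q)           # u <- Z/qZ
    v = pow(h, u, p)                   # v <- h^u
    c = pow(g, u, p)                   # c <- g^u
    return H(v, c), c                  # (k, c)

def decap(x: int, c: int) -> bytes:
    v = pow(c, x, p)                   # v <- c^x = g^(u*x) = h^u
    return H(v, c)
```

Correctness rests on the identity $c^x = g^{u x} = h^u$, so both sides hash the same pair $(v, c)$.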


However, to prove this KEM secure we need to introduce a new problem called the Gap Diffie-
Hellman problem. This problem assumes that the Diffie-Hellman problem is hard even assuming
that the adversary has an oracle to solve the Decision Diffie-Hellman problem. In other words, we
are given $g^a$ and $g^b$ and an oracle $O_{DDH}$ which on input of $(g^x, g^y, g^z)$ will say whether $z = x \cdot y$. We
Algorithm 16.5: Algorithm B using an IND-CCA adversary A against RSA-KEM to solve
the RSA problem

B has input $N, e, y$ and is asked to find $x$ such that $x^e \pmod{N} = y$.
$\mathfrak{pk} \leftarrow (N, e)$.
$k \leftarrow \mathbb{K}$, $c^* \leftarrow y$.
H-List $\leftarrow \emptyset$.
Call A with input $\mathfrak{pk}, k, c^*$.

/* A's $H$ Oracle Queries */
A makes a random oracle query with input $z$.
if $\exists\, (z, c, h) \in$ H-List then return $h$.
if $\exists\, (\perp, c, h) \in$ H-List with $z^e \pmod{N} = c$ then
    H-List $\leftarrow \left( \text{H-List} \cup \{(z, c, h)\} \right) \setminus \{(\perp, c, h)\}$.
else
    $h \leftarrow \mathbb{K}$.
    $c \leftarrow z^e \pmod{N}$.
    H-List $\leftarrow$ H-List $\cup\, \{(z, c, h)\}$.
return $h$.

/* A's $O_{\mathrm{Decap}_{\mathfrak{sk}}}$ Oracle Queries */
A makes an $O_{\mathrm{Decap}_{\mathfrak{sk}}}$ query with ciphertext $c$.
if $\exists\, (\cdot, c, h) \in$ H-List then return $h$.
$h \leftarrow \mathbb{K}$.
H-List $\leftarrow$ H-List $\cup\, \{(\perp, c, h)\}$.
return $h$.

/* A's Response */
When A returns $b'$.
if $\exists\, (x, c^*, \cdot) \in$ H-List then return $x$.
return $\perp$.


then wish to output $g^{a \cdot b}$. We define $\mathrm{Adv}^{\mathrm{Gap\text{-}DHP}}_{G}(A)$ as the probability that the algorithm A wins the
Diffie-Hellman problem game, given access to an oracle which solves the Decision Diffie-Hellman
problem. It is believed that this problem is as hard as the standard Diffie-Hellman problem. Indeed
there are some groups in which the Decision Diffie-Hellman problem is easy and the computational
Diffie-Hellman problem is believed to be hard.

We can now prove that the DHIES-KEM is secure. Before stating and proving the theorem
we pause to point out why we need the Gap Diffie-Hellman problem. In the proof of security of
RSA-KEM, Theorem 16.8, algorithm B's simulation of a valid attack environment to algorithm
A was perfect. In other words A could not notice it was playing against someone trying to solve
the RSA problem, and not a genuine encryption system. Algorithm B did this by "cooking" the
values output by the random oracle, by computing the RSA function in a forwards direction. In the
simulation in the theorem below, algorithm B still proceeds with much the same strategy. However,
to do a similar cooking of the random oracle algorithm B needs to be able to distinguish Diffie-
Hellman tuples from non-Diffie-Hellman tuples. Thus algorithm B needs a mechanism to do this.
Hence, algorithm B needs access to an $O_{DDH}$ oracle, and so B does not solve the Diffie-Hellman
problem, but the Gap Diffie-Hellman problem.
Theorem 16.9. In the random oracle model and assuming the Gap Diffie-Hellman problem is
hard, there exists no adversary which breaks DHIES-KEM. In particular if A is an adversary which
breaks the IND-CCA security of the DHIES-KEM scheme $\Pi$ for the group $G$, which treats $H$ as a
random oracle, then there is an adversary B against the Gap Diffie-Hellman problem for the group
$G$ with

\[ \mathrm{Adv}^{\mathrm{IND\text{-}CCA}}_{\Pi}(A) = \mathrm{Adv}^{\mathrm{Gap\text{-}DHP}}_{G}(B). \]

Proof. We provide a sketch of the proof by simply giving algorithm B in Algorithm 16.6, and
presenting some comments. Notice that the public key is $g^a$ and the target encapsulation is $c^* = g^b$.
Hence, the Diffie-Hellman value which needs to be passed to the key derivation function $H$ to obtain
the target encapsulated key is $v^* = g^{a \cdot b}$. Since $H$ is a random oracle the only way A can find out any
information about the encapsulated key is to make the query $H(v^* \| c^*)$. Thus the Diffie-Hellman
value, if A is successful, will end up on B's H-List. When looking at Algorithm 16.6 you should
compare it with Algorithm 16.5.


Algorithm 16.6: Algorithm B using an IND-CCA adversary A against DHIES-KEM to solve
the Gap Diffie-Hellman problem

B has input $A = g^a$, $B = g^b$ and is asked to find $C = g^{a \cdot b}$.
$\mathfrak{pk} = h \leftarrow A$.
$k \leftarrow \mathbb{K}$, $c^* \leftarrow B$.
H-List $\leftarrow \emptyset$.
Call A with input $\mathfrak{pk}, k, c^*$.

/* A's $H$ Oracle Queries */
A makes a random oracle query with input $z \| c$.
if $\exists\, (z, c, h) \in$ H-List then return $h$.
if $\exists\, (\perp, c, h) \in$ H-List such that $O_{DDH}(g, A, c, z) =$ true then
    H-List $\leftarrow \left( \text{H-List} \cup \{(z, c, h)\} \right) \setminus \{(\perp, c, h)\}$.
else
    $h \leftarrow \mathbb{K}$.
    H-List $\leftarrow$ H-List $\cup\, \{(z, c, h)\}$.
return $h$.

/* A's $O_{\mathrm{Decap}_{\mathfrak{sk}}}$ Oracle Queries */
A makes an $O_{\mathrm{Decap}_{\mathfrak{sk}}}$ query with ciphertext $c$.
if $\exists\, (\cdot, c, h) \in$ H-List then return $h$.
$h \leftarrow \mathbb{K}$.
H-List $\leftarrow$ H-List $\cup\, \{(\perp, c, h)\}$.
return $h$.

/* A's Response */
When A returns $b'$.
if $\exists\, (C, c^*, \cdot) \in$ H-List then return $C$.
return $\perp$.


□
16.5. Secure Digital Signatures

We  already  saw  in  Chapter  15  how  the  combination  of  hash  functions  and  the  RSA  function  could 
be  used  to  produce  digital  signatures.  In  particular  we  presented  the  RSA-FDH  signature  scheme. 
In this section we first prove that this scheme is secure in the random oracle model. However, RSA-FDH
requires a hash function with codomain the RSA group. Since such hash functions are not
“natural”  we  also  present  the  RSA-PSS  signature  scheme  which  does  not  have  this  restriction,  and 
which  is  also  secure  in  the  random  oracle  model.  Having  presented  these  two  variants  of  signatures 
based  on  the  RSA  problem,  we  then  turn  to  discussing  signature  schemes  based  on  the  discrete 
logarithm  problem. 


16.5.1.  RSA-FDH:  In  Chapter  15  we  presented  the  RSA-FDH  signature  scheme.  The  proof  we 
outline  below  for  RSA-FDH  bears  much  in  common  with  the  proof  for  RSA-KEM  above,  especially 
in  the  way  the  hash  function  is  modelled  as  a  random  oracle.  For  RSA-FDH  we  assume  a  hash 
function 

\[ H : \{0,1\}^* \longrightarrow (\mathbb{Z}/N\mathbb{Z})^*, \]

where  N  is  the  RSA  modulus  of  the  public  key.  Again  such  hash  functions  are  hard  to  construct  in 
practice,  but  if  we  assume  they  can  exist  and  we  model  them  using  a  random  oracle  then  we  can 
prove  the  RSA-FDH  signature  algorithm  is  secure. 

As above let $f_{N,e}$ denote the function

\[ f_{N,e} : (\mathbb{Z}/N\mathbb{Z})^* \longrightarrow (\mathbb{Z}/N\mathbb{Z})^*, \qquad x \longmapsto x^e. \]

The RSA-FDH signature algorithm signs a message $m$ as follows

\[ s \leftarrow H(m)^d \pmod{N} = f_{N,e}^{-1}(H(m)), \]

where the private exponent is $d$. Verification of a signature is performed by checking whether

\[ f_{N,e}(s) = s^e \pmod{N} = H(m). \]
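The signing and verification equations can be exercised with a small Python sketch. The function `fdh` below is only a stand-in for a hash onto $(\mathbb{Z}/N\mathbb{Z})^*$: it stretches SHA-256 with a counter and reduces modulo $N$, which is an assumption made for this example and not a construction taken from the text. The toy modulus is likewise chosen only so that the code is self-contained.

```python
import hashlib

# Toy RSA key (assumption): product of two Mersenne primes, not a secure size.
p, q = 2**61 - 1, 2**89 - 1
N, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))

def fdh(m: bytes) -> int:
    """Stand-in full-domain hash: counter-mode SHA-256 output reduced modulo N."""
    nbytes = (N.bit_length() + 7) // 8 + 16          # oversample to reduce bias
    blocks = nbytes // 32 + 1
    stream = b"".join(
        hashlib.sha256(i.to_bytes(4, "big") + m).digest() for i in range(blocks)
    )
    return int.from_bytes(stream[:nbytes], "big") % N

def sign(m: bytes) -> int:
    return pow(fdh(m), d, N)                         # s <- H(m)^d mod N

def verify(m: bytes, s: int) -> bool:
    return pow(s, e, N) == fdh(m)                    # check s^e mod N = H(m)
```

Signing is deterministic here: the same message always yields the same signature, in contrast to RSA-PSS.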

Recall  that  the  RSA  problem  is  given  y  =  fN,e(x )  determine  x.  One  can  then  prove  the  following 
theorem. 


Theorem 16.10. If we model $H$ as a random oracle then the RSA-FDH
signature scheme is secure, assuming the RSA problem is hard. In particular if A is an EUF-CMA
adversary against the RSA-FDH signature scheme $\Pi$ for RSA moduli of $v$ bits in length which
performs $q_H$ distinct hash function queries, then there is an algorithm B for the RSA problem such
that

\[ \mathrm{Adv}^{\mathrm{EUF\text{-}CMA}}_{\Pi}(A; q_H) = q_H \cdot \mathrm{Adv}^{\mathrm{RSA}}_{v}(B). \]

Note that in this theorem the advantage term has a "security loss" of $q_H$; this is because algorithm
B in the proof needs to "guess" into which query it should embed the RSA problem challenge.

Proof. We describe an algorithm B which on input of $y \in (\mathbb{Z}/N\mathbb{Z})^*$ outputs $x = f_{N,e}^{-1}(y)$. Without
loss of generality we can assume that algorithm A always makes a hash function query on a message
$m$ before it is passed to its signing oracle. Indeed, if this is not the case then B can make these
queries for A.

Algorithm B first chooses a value $t \in [1, \ldots, q_H]$ and throughout keeps a numbered record of
all the hash queries made. Algorithm B takes as input $N$, $e$ and $y$ and sets the public key to be
$\mathfrak{pk} \leftarrow (N, e)$. Algorithm B maintains a hash list H-List as before, which is initially set to the empty
set. The public key is then passed to algorithm A.

When algorithm A makes a hash function query for the input $m$, algorithm B responds as
follows:

• If there exists $(m, h, \cdot) \in$ H-List then return $h$.

• If this is the $t$th distinct query to the hash function then B sets

    H-List $\leftarrow$ H-List $\cup\, \{(m, y, \perp)\}$

and responds with $y$. We let $m^*$ denote this message.

• Else B picks $s \leftarrow (\mathbb{Z}/N\mathbb{Z})^*$, sets $h \leftarrow s^e \pmod{N}$ and sets

    H-List $\leftarrow$ H-List $\cup\, \{(m, h, s)\}$

and responds with $h$.

If A makes a signing query for a message $m$ then algorithm B responds as follows.

• If message $m$ is equal to $m^*$ then algorithm B stops and returns fail.

• If $m \ne m^*$ then B returns the value $s$ such that $(m, h, s) \in$ H-List.

Let A terminate with output $(m, s)$ and without loss of generality we can assume that A made a
hash oracle query for the message $m$. Now if $m \ne m^*$ then B terminates and admits failure, but if
$m = m^*$ then we have

\[ f_{N,e}(s) = H(m^*) = y. \]

Hence we have succeeded in inverting $f_{N,e}$.

In analysing algorithm B one notices that if A terminates successfully then $(m^*, s)$ is an
existential forgery and so $m^*$ was not asked of the signing oracle. The value of $t$ is independent of the
view of A, so A cannot bias its behaviour towards requesting, or avoiding, a signature on the
message $m^*$. Hence, roughly speaking, the probability of success of B is $1/q_H$ that of
the probability of A being successful. □


16.5.2.  RSA-PSS:  Another  way  of  securely  using  RSA  as  a  signature  algorithm  is  to  use  a  system 
called  RSA-PSS,  or  probabilistic  signature  scheme.  This  scheme  can  also  be  proved  secure  in  the 
random  oracle  model  under  the  assumption  that  the  RSA  problem  is  hard.  We  do  not  give  the 
details  of  the  proof  here  but  simply  explain  the  scheme,  which  appears  in  many  cryptographic 
standards.  The  advantage  of  RSA-PSS  over  RSA-FDH  is  that  one  only  requires  a  hash  function 
with a traditional codomain, e.g. bit strings of length $\ell$, rather than a set of integers modulo another
number.

As usual one takes an RSA modulus $N$, a public exponent $e$ and a private exponent $d$. Suppose
the security parameter is $k$, i.e. $N$ is a $k$-bit number. We define two integers $k_0$ and $k_1$ so that
$k_0 + k_1 \le k - 1$, such that a work effort of $2^{k_0}$ and $2^{k_1}$ is considered infeasible; for example one
could take $k_i = 128$ or $160$. We then define two hash functions, one which expands data and one
which compresses data (just like in RSA-OAEP):

\[ G : \{0,1\}^{k_1} \longrightarrow \{0,1\}^{k - k_1 - 1}, \qquad H : \{0,1\}^* \longrightarrow \{0,1\}^{k_1}. \]

We let

\[ G_1 : \{0,1\}^{k_1} \longrightarrow \{0,1\}^{k_0} \]

denote the function which returns the first $k_0$ bits of $G(w)$ for $w \in \{0,1\}^{k_1}$ and we let

\[ G_2 : \{0,1\}^{k_1} \longrightarrow \{0,1\}^{k - k_0 - k_1 - 1} \]

denote the function which returns the last $k - k_0 - k_1 - 1$ bits of $G(w)$ for $w \in \{0,1\}^{k_1}$, i.e.
$G(w) = G_1(w) \| G_2(w)$.


Signing: To sign a message $m$ the private key holder performs the following steps:

• $r \leftarrow \{0,1\}^{k_0}$.

• $w \leftarrow H(m \| r)$.

• $y \leftarrow 0 \| w \| (G_1(w) \oplus r) \| G_2(w)$.

• $s \leftarrow y^d \pmod{N}$.

Verification: To verify a signature $(s, m)$ the public key holder performs the following steps:

• $y \leftarrow s^e \pmod{N}$.

• Split $y$ into the components

\[ b \| w \| \alpha \| \gamma \]

where $b$ is one bit long, $w$ is $k_1$ bits long, $\alpha$ is $k_0$ bits long and $\gamma$ is $k - k_0 - k_1 - 1$ bits
long.

• $r \leftarrow \alpha \oplus G_1(w)$.

• The signature is verified as correct if and only if $b = 0$ and $G_2(w) = \gamma$ and $H(m \| r) = w$.
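The bit-level layout $0 \| w \| (G_1(w) \oplus r) \| G_2(w)$ is easy to get wrong, so a small Python sketch may help. It assumes toy sizes $k_0 = k_1 = 32$, a 150-bit modulus built from two Mersenne primes, and hash functions $G$ and $H$ derived from SHA-256 by truncation; all of these, and the helper `_hash_bits`, are assumptions for illustration and are far below the sizes suggested in the text.

```python
import hashlib
import secrets

# Toy RSA key (assumption); N has k = 150 bits.
p, q = 2**61 - 1, 2**89 - 1
N, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))
k = N.bit_length()
k0, k1 = 32, 32
REST = k - k0 - k1 - 1                     # length of G2(w) in bits

def _hash_bits(tag: bytes, data: bytes, nbits: int) -> int:
    """First nbits of a counter-mode SHA-256 stream, as an integer."""
    out, ctr = b"", 0
    while 8 * len(out) < nbits:
        out += hashlib.sha256(tag + ctr.to_bytes(4, "big") + data).digest()
        ctr += 1
    return int.from_bytes(out, "big") >> (8 * len(out) - nbits)

def H(m: bytes, r: int) -> int:            # {0,1}^* -> {0,1}^k1, applied to m || r
    return _hash_bits(b"H", m + r.to_bytes(k0 // 8, "big"), k1)

def G(w: int) -> int:                      # {0,1}^k1 -> {0,1}^(k - k1 - 1)
    return _hash_bits(b"G", w.to_bytes(k1 // 8, "big"), k - k1 - 1)

def G1(w: int) -> int:                     # first k0 bits of G(w)
    return G(w) >> REST

def G2(w: int) -> int:                     # last k - k0 - k1 - 1 bits of G(w)
    return G(w) & ((1 << REST) - 1)

def sign(m: bytes) -> int:
    r = secrets.randbits(k0)
    w = H(m, r)
    # y = 0 || w || (G1(w) xor r) || G2(w); the leading zero bit keeps y < N
    y = (w << (k - k1 - 1)) | ((G1(w) ^ r) << REST) | G2(w)
    return pow(y, d, N)

def verify(m: bytes, s: int) -> bool:
    y = pow(s, e, N)
    b = y >> (k - 1)
    w = (y >> (k - k1 - 1)) & ((1 << k1) - 1)
    alpha = (y >> REST) & ((1 << k0) - 1)
    gamma = y & ((1 << REST) - 1)
    r = alpha ^ G1(w)
    return b == 0 and G2(w) == gamma and H(m, r) == w
```

Because $r$ is fresh for every call, signing the same message twice will almost always give two different signatures, both of which verify.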

If  we  allow  the  modelling  of  the  hash  functions  G  and  H  by  random  oracles  then  one  can  show  that 
the  above  signature  algorithm  is  EUF-CMA  secure,  in  the  sense  that  the  existence  of  a  successful 
algorithm  to  find  forgeries  could  be  used  to  produce  an  algorithm  to  invert  the  RSA  function.  For 
the  proof  of  this  one  should  consult  the  Eurocrypt  1996  paper  of  Bellare  and  Rogaway  mentioned 
in  the  Further  Reading  section  at  the  end  of  this  chapter. 


16.5.3. The Digital Signature Algorithm: We have already presented two secure digital
signature schemes, namely RSA-FDH and RSA-PSS. You may ask why do we need another one?

•  What  if  someone  breaks  the  RSA  algorithm  or  finds  that  factoring  is  easy? 

•  RSA  is  not  suited  to  some  applications  since  signature  generation  is  a  very  costly  operation. 

•  RSA  signatures  are  very  large;  some  applications  require  smaller  signature  footprints. 

One algorithm which addresses all of these concerns is the Digital Signature Algorithm, or DSA.
One sometimes sees this referred to as the DSS, or Digital Signature Standard. Although originally
designed to work in the group $\mathbb{F}_p^*$, where $p$ is a large prime, it is now common to see it used with
elliptic curves, in which case it is called EC-DSA. The elliptic curve variants of DSA run very fast
and have smaller footprints and key sizes than almost all other signature algorithms. We shall first
describe the basic DSA algorithm as it applies to finite fields. In this variant the security is based
on the difficulty of solving the discrete logarithm problem in the field $\mathbb{F}_p$.

Domain Parameters: Just as in ElGamal encryption we first need to define the domain
parameters, which are identical to those used in ElGamal encryption. These are

• $p$ a 'large prime', by which we mean one with around 2048 bits, such that $p - 1$ is divisible
by another 'medium prime' $q$ of around 256 bits.

• $g$ an element of $\mathbb{F}_p^*$ of prime order $q$, i.e. $g = r^{(p-1)/q} \pmod{p} \ne 1$ for some $r \in \mathbb{F}_p^*$.

• A hash function $H$ which maps bit strings to elements of $\mathbb{Z}/q\mathbb{Z}$.

The domain parameters create a public finite abelian group $G$ of prime order $q$ with generator $g$.
Such domain parameters can be shared between a large number of users.

Key Generation: Again key generation is exactly the same as in ElGamal encryption. The private
key $\mathfrak{sk}$ is chosen to be an integer $x \leftarrow [0, \ldots, q - 1]$, whilst the public key is given by $\mathfrak{pk} = h \leftarrow g^x \pmod{p}$.

Signing: DSA is a signature with appendix algorithm and the signature produced consists of two
elements $r, s \in \mathbb{Z}/q\mathbb{Z}$. To sign a message $m$ the user performs the following steps:

• $h \leftarrow H(m)$.

• $k \leftarrow (\mathbb{Z}/q\mathbb{Z})^*$.

• $r \leftarrow (g^k \pmod{p}) \pmod{q}$.

• $s \leftarrow (h + x \cdot r)/k \pmod{q}$.

The signature on $m$ is then the pair $(r, s)$. Notice that to sign we utilize a secret ephemeral key
on every signature. One issue with DSA is that this ephemeral key $k$ really needs to be kept secret
and truly random, otherwise attacks like the earlier partial key exposure attacks from Section 15.5
can be deployed.
Verification: To verify the signature $(r, s)$ on the message $m$ for the public key $\mathfrak{pk} = g^x$, the verifier
performs the following steps.

• $h \leftarrow H(m)$.

• $a \leftarrow h/s \pmod{q}$.

• $b \leftarrow r/s \pmod{q}$.

• $v \leftarrow (g^a \cdot \mathfrak{pk}^b \pmod{p}) \pmod{q}$.

• Accept the signature if and only if $v = r$.


As  a  baby  example  of  DSA  consider  the  following  domain  parameters 

q  =  13,  p  =  4  •  q  +  1  =  53  and  g  =  16. 

Suppose the public/private key pair of the user is given by x ← 3 and h ← g^3 (mod p) = 15. Now, if we wish to sign a message which has hash value H(m) = 5, we first generate an ephemeral secret key, for example purposes we shall take k ← 2, and then we compute

r ← (g^k (mod p)) (mod q) = 5,
s ← (H(m) + x · r)/k (mod q) = 10.

To verify this signature the recipient computes

a ← H(m)/s (mod q) = 7,
b ← r/s (mod q) = 7,
v ← (g^a · h^b (mod p)) (mod q) = 5.

Note  v  =  r  and  so  the  signature  is  verified  correctly. 
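
The baby example can be replayed in a few lines of Python. This is only a sketch under stated assumptions: the function names dsa_sign and dsa_verify are ours, the hash value is passed in directly rather than computed from a message, and the range check on (r, s) is a common precaution which the description above does not spell out. It needs Python 3.8 or later for modular inversion via pow, and the parameters are far too small for real use.

```python
# Toy DSA with the baby parameters from the text (illustrative only).
q, p, g = 13, 53, 16

def dsa_sign(x, hm, k):
    """Sign the hash value hm with private key x and ephemeral key k."""
    r = pow(g, k, p) % q
    s = (hm + x * r) * pow(k, -1, q) % q   # division by k is inversion mod q
    return r, s

def dsa_verify(pub, hm, sig):
    """Check the signature sig = (r, s) on hash value hm under public key pub."""
    r, s = sig
    if not (0 < r < q and 0 < s < q):      # assumed sanity check, not in the text
        return False
    s_inv = pow(s, -1, q)
    a = hm * s_inv % q
    b = r * s_inv % q
    v = (pow(g, a, p) * pow(pub, b, p) % p) % q
    return v == r

x = 3
pub = pow(g, x, p)                 # public key, 15 in the text
sig = dsa_sign(x, hm=5, k=2)       # (5, 10) in the text
print(pub, sig, dsa_verify(pub, 5, sig))
```

Changing the hash value passed to dsa_verify (for instance to 6) makes the check fail, which is the behaviour one expects of a signature with appendix.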

The DSA algorithm uses the subgroup of F_p^* of order q which is generated by g. The private key can clearly be recovered from the public key if the discrete logarithm problem can be solved in the cyclic group ⟨g⟩ of order q. Thus, taking into account our discussion of the discrete logarithm problem in Chapter 3, we require for security that

• p > 2^2048, to avoid attacks via the Number Field Sieve,

• q > 2^256, to avoid attacks via the Baby-Step/Giant-Step method.

Hence,  to  achieve  the  rough  equivalent  of  128  bits  of  AES  strength  we  need  to  operate  on  integers 
of  roughly  2048  bits  in  length.  This  makes  DSA  slower  than  RSA,  since  the  DSA  operation  is  more 
complicated than RSA. For example, the verification operation for an equivalent RSA signature
requires  only  one  exponentiation  modulo  a  2048-bit  number,  and  even  that  is  an  exponentiation  by 
a  small  number.  For  DSA,  verification  requires  two  exponentiations  modulo  a  2048-bit  number.  In 
addition  the  signing  operation  for  DSA  is  more  complicated  than  the  procedure  for  RSA  signatures, 
due  to  the  need  to  compute  the  value  of  s,  which  requires  an  inversion  modulo  q. 

The other main problem is that the DSA algorithm only needs to work in a finite abelian group of size 2^256, but since the integers modulo p are susceptible to an attack from the Number Field Sieve we are required to work with group elements of 2048 bits in size. This produces a significant performance penalty.

Luckily we can generalize DSA to an arbitrary finite abelian group in which the discrete logarithm problem is hard. We can then use a group which provides a harder instance of the discrete logarithm problem, for example the group of points on an elliptic curve over a finite field. To explain this generalization, we write G = ⟨g⟩ for a group generated by g; we assume that

• g has prime order q > 2^256,

• the discrete logarithm problem with respect to g is hard,

• there is a public function f such that

f : G → Z/qZ.

We  summarize  the  differences  between  DSA  and  EC-DSA  in  the  following  table. 


Quantity    DSA              EC-DSA
G           ⟨g⟩ < F_p^*      ⟨P⟩ < E(F_p)
g           g ∈ F_p^*        P ∈ E(F_p)
y           g^x              [x]P
f(·)        · (mod q)        x-coord(·) (mod q)

For this generalized form of DSA each user again generates a secret signing key, x. The public key is again given by h ← g^x. Signatures are computed via the steps

• h ← H(m).

• k ← (Z/qZ)*.

• r ← f(g^k).

• s ← (h + x · r)/k (mod q).

To verify the signature (r, s) on the message m the verifier performs the following steps.

• h ← H(m).

• a ← h/s (mod q).

• b ← r/s (mod q).

• v ← f(g^a · h^b), where the base h of h^b is the public key.

• Accept the signature if and only if v = r.

You  should  compare  this  signature  and  verification  algorithm  with  that  given  earlier  for  DSA  and 
spot  where  they  differ.  When  used  for  EC-DSA  the  above  generalization  is  written  additively. 


EC-DSA  Example:  As  a  baby  example  of  EC-DSA  take  the  following  elliptic  curve 

y^2 = x^3 + x + 3,

over the field F_199. The number of elements in E(F_199) is equal to q = 197 which is a prime; the elliptic curve group is therefore cyclic and as a generator we can take P = (1, 76). As a private key let us take x = 29, and so the associated public key is given by

Y = [x]P = [29](1, 76) = (113, 191).


Suppose the holder of this public key wishes to sign a message with hash value H(m) equal to 68.
They  first  produce  a  random  ephemeral  key,  which  we  shall  take  to  be  k  =  153,  and  compute 


r  =  x-coord  ([k]P)  =  x-coord  ([153] (1,  76)) 
=  x-coord  ((185,  35))  =  185. 


Now  they  compute 


s  =  (H(m)  +  x  •  r)/k  (mod  q) 

=  (68  +  29  •  185) /153  (mod  197) 
=  78. 


The  signature  is  then  the  pair  (r,  s )  =  (185,  78). 


To  verify  this  signature  we  compute 

a = H(m)/s (mod q) = 68/78 (mod 197) = 112,
b = r/s (mod q) = 185/78 (mod 197) = 15.

We then compute


Z  =  [a]P+[b]Y  =  [112] (1 ,  76)  +  [15](113, 191) 
=  (111,  60)  +  (122, 140)  =  (185,  35). 


The  signature  is  now  verified  since  we  have 

r  =  185  =  x-coord(Z). 
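
The EC-DSA example can be checked with a short Python sketch. The helper names ec_add, ec_mul, ecdsa_sign and ecdsa_verify are ours, the point at infinity is represented by None, and the hash value is passed in directly; this is an illustration of the arithmetic above on the toy curve, not a hardened implementation. It needs Python 3.8 or later.

```python
# Toy EC-DSA on y^2 = x^3 + x + 3 over F_199 (illustrative only).
p, a, q = 199, 1, 197          # field prime, curve coefficient, group order
P = (1, 76)                    # generator; None stands for the point at infinity

def ec_add(A, B):
    """Add two points using the affine chord-and-tangent rule."""
    if A is None:
        return B
    if B is None:
        return A
    (x1, y1), (x2, y2) = A, B
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                   # A = -B
    if A == B:
        lam = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return x3, (lam * (x1 - x3) - y1) % p

def ec_mul(n, A):
    """Compute [n]A by double-and-add."""
    result = None
    while n:
        if n & 1:
            result = ec_add(result, A)
        A = ec_add(A, A)
        n >>= 1
    return result

def ecdsa_sign(x, hm, k):
    r = ec_mul(k, P)[0] % q
    s = (hm + x * r) * pow(k, -1, q) % q
    return r, s

def ecdsa_verify(Y, hm, sig):
    r, s = sig
    s_inv = pow(s, -1, q)
    Z = ec_add(ec_mul(hm * s_inv % q, P), ec_mul(r * s_inv % q, Y))
    return Z is not None and Z[0] % q == r

Y = ec_mul(29, P)                       # public key, (113, 191) in the text
sig = ecdsa_sign(29, hm=68, k=153)      # (185, 78) in the text
print(Y, sig, ecdsa_verify(Y, 68, sig))
```

Comparing ecdsa_sign with the finite field version shows that only the group operation and the map f (here the x-coordinate reduced modulo q) have changed.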


It  is  believed  that  DSA  and  EC-DSA  do  provide  secure  signature  algorithms,  in  the  sense  of 
EUF-CMA,  however  no  proof  of  this  fact  is  known  in  the  standard  model  or  the  random  oracle 
model.  However,  if  instead  of  modelling  the  hash  function  as  an  ideal  object  (as  in  the  random 
oracle  model),  we  model  the  group  as  an  ideal  object  (something  called  the  generic  group  model) 
then  we  can  show  that  EC-DSA  is  EUF-CMA  secure.  But  this  uses  techniques  beyond  the  scope  of 
this  book. 


16.5.4.  Schnorr  Signatures:  There  are  many  variants  of  the  DSA  signature  scheme  based  on 
discrete  logarithms.  A  particularly  interesting  one  is  that  of  Schnorr  signatures.  We  present  the 
algorithm in the general case and allow the reader to work out the differences between the elliptic curve and finite field variants.

Suppose  G  is  a  public  finite  abelian  group  generated  by  an  element  g  of  prime  order  q.  The 
public/private  key  pairs  are  just  the  same  as  in  DSA,  namely 

•  The  private  key  is  an  integer  x  in  the  range  0  <  x  <  q. 

• The public key is the element h ← g^x.


Signing:  To  sign  a  message  m  using  the  Schnorr  signature  algorithm  the  signer  performs  the 
following  steps: 


(1) k ← Z/qZ.

(2) r ← g^k.

(3) e ← H(m‖r).

(4) s ← k + x · e (mod q).


The  signature  is  then  given  by  the  pair  (e,  s).  Notice  how  the  hash  function  depends  both  on  the 
message  and  the  ephemeral  public  key  r;  we  will  see  this  is  crucial  in  order  to  establish  the  security 
results  below.  In  addition,  notice  that  the  signing  equation  is  easier  than  that  used  for  DSA,  as  we 
do  not  require  a  modular  inversion  modulo  q. 


Verification:  The  verification  step  is  very  simple: 

• r ← g^s · h^{−e}.

• The signature is accepted if and only if e = H(m‖r).


Schnorr Signature Example: As an example of Schnorr signatures in a finite field we take the domain parameters q = 101, p = 607 and g = 601. As the public/private key pair we assume x ← 3 and h ← g^x (mod p) = 391. Then to sign a message we generate an ephemeral key, let's take k ← 65, and compute r ← g^k (mod p) = 223. We now need to compute the hash value e ← H(m‖r) (mod q). Let us assume that we compute e = 93; then the second component of the signature is given by

s ← k + x · e (mod q) = 65 + 3 · 93 (mod 101) = 41.

We  leave  it  to  the  reader  to  check  that  the  signature  (e,  s )  verifies,  i.e.  that  the  verifier  recovers  the 
same  value  of  r. 
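
The check left to the reader can be carried out in Python. The sketch below is a toy under stated assumptions: the hash H is our own stand-in (SHA-256 of the message followed by a two-byte encoding of r, reduced modulo q), so it will not in general produce the value e = 93 assumed in the text; that value is therefore replayed by hand. It needs Python 3.8 or later.

```python
import hashlib

q, p, g = 101, 607, 601        # toy domain parameters from the text

def H(m: bytes, r: int) -> int:
    """Hypothetical hash H(m || r) into Z/qZ; the encoding of r is our choice."""
    data = m + r.to_bytes(2, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def schnorr_sign(x, m, k):
    r = pow(g, k, p)
    e = H(m, r)
    return e, (k + x * e) % q

def schnorr_verify(pub, m, sig):
    e, s = sig
    r = pow(g, s, p) * pow(pub, -e, p) % p    # r = g^s * h^(-e)
    return e == H(m, r)

x, k = 3, 65
pub = pow(g, x, p)                            # 391
r = pow(g, k, p)                              # 223
# Replaying the text's assumed challenge e = 93:
s = (k + x * 93) % q                          # 41
recovered_r = pow(g, s, p) * pow(pub, -93, p) % p
print(pub, r, s, recovered_r == r)
print(schnorr_verify(pub, b"hello", schnorr_sign(x, b"hello", k)))
```

The verifier never inverts anything modulo q; the only inversion is of the public key modulo p, hidden inside the negative exponent.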

Schnorr Authentication Protocols: Schnorr signatures have been suggested for use in challenge response mechanisms in smart cards since the response part of the signature (the value of
s)  is  particularly  easy  to  evaluate  because  it  only  requires  the  computation  of  a  single  modular 
multiplication  and  a  single  modular  addition.  No  matter  what  group  we  choose  this  final  phase 
only  requires  arithmetic  modulo  a  relatively  small  prime  number. 

To see how one uses Schnorr signatures in a challenge response situation we give the following scenario. You wish to use a smart card to authenticate yourself to a building or ATM. The card reader has a copy of your public key h, whilst the card has a copy of your private key x. Whilst you are walking around the card is generating commitments, which are ephemeral public keys of the form r = g^k.

When  you  place  your  card  into  the  card  reader  the  card  transmits  to  the  reader  the  value  of 
one  of  these  precomputed  commitments.  The  card  reader  then  responds  with  a  challenge  message 
e.  Your  card  then  only  needs  to  compute 

s = k + x · e (mod q),

and transmit it to the reader, which then verifies the ‘signature’ by checking whether

g^s = r · h^e.

Notice that the only online computation needed by the card is the computation of the value of s (the challenge e is supplied by the reader), which is easy to perform.

In  more  detail,  if  we  let  C  denote  the  card  and  R  denote  the  card  reader  then  we  have 

C → R : r = g^k,

R → C : e,

C → R : s = k + x · e (mod q).

The  point  of  the  initial  commitment  is  to  stop  either  the  challenge  being  concocted  so  as  to  reveal 
your  private  key,  or  your  response  being  concocted  so  as  to  fool  the  reader.  A  three-phase  protocol 
consisting  of 

commitment  — >  challenge  — >  response 

is  a  common  form  of  authentication  protocol,  and  we  shall  see  more  protocols  of  this  nature  when 
we discuss zero-knowledge proofs in Chapter 21.
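
The three flows of the protocol can be mimicked by two small Python classes. This is a sketch only: the class and method names (Card, Reader, run_protocol) are hypothetical, the toy parameters are those of the earlier Schnorr example, and a real card would precompute and store many commitments rather than one.

```python
import secrets

q, p, g = 101, 607, 601            # toy parameters reused from the Schnorr example

class Card:
    def __init__(self, x):
        self.x = x                 # private key held on the card
        self.k = None
    def commit(self):
        """Offline step: pick k and return the commitment r = g^k."""
        self.k = secrets.randbelow(q)
        return pow(g, self.k, p)
    def respond(self, e):
        """Online step: one multiplication and one addition modulo q."""
        s = (self.k + self.x * e) % q
        self.k = None              # an ephemeral key must never be reused
        return s

class Reader:
    def __init__(self, pub):
        self.pub = pub             # public key h of the card holder
    def challenge(self):
        return secrets.randbelow(q)
    def check(self, r, e, s):
        return pow(g, s, p) == r * pow(self.pub, e, p) % p

def run_protocol(card, reader):
    r = card.commit()              # C -> R : r
    e = reader.challenge()         # R -> C : e
    s = card.respond(e)            # C -> R : s
    return reader.check(r, e, s)

print(run_protocol(Card(3), Reader(pow(g, 3, p))))
```

A card holding the wrong private key passes the check only when the challenge happens to be zero, which is why the reader must choose e at random after seeing the commitment.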


Schnorr  Signature  Security:  We  shall  now  prove  that  Schnorr  signatures  are  EUF-CMA  secure 
in  the  random  oracle  model.  The  proof  uses  something  called  the  forking  lemma,  which  we  present 
without  proof;  see  the  paper  of  Pointcheval  and  Stern  mentioned  at  the  end  of  this  chapter. 


Lemma 16.11 (Forking Lemma). Let A be a randomized algorithm with inputs (x, h_1, ..., h_q, r), where r is the randomness used by A drawn from a distribution R, and the values h_1, ..., h_q are selected from a set ℋ uniformly at random.

Assume that A outputs a pair (t, y). Let ε_A be the probability that the value t output by A is in the range [1, ..., q]. Define the algorithm B in Algorithm 16.7, which has input x, and let ε_B denote the probability that B outputs a non-zero tuple. Then

ε_B ≥ ε_A · (ε_A/q − 1/|ℋ|).


Algorithm 16.7: Forking algorithm B
r ← R.
h_1, ..., h_q ← ℋ.
(t, y) ← A(x, h_1, ..., h_q, r).
if t = 0 then return (0, 0).
h'_t, ..., h'_q ← ℋ.
(t', y') ← A(x, h_1, ..., h_{t−1}, h'_t, ..., h'_q, r).
if t ≠ t' or h_t = h'_t then return (0, 0).
return (y, y').

How we think about (and use) the A and B of the Lemma is as follows. Algorithm A is assumed to be running in the random oracle model, and the h_i values are the responses it receives to its random oracle queries. There is a special query which A uses in producing its answer y; this is called the critical query, and its index is denoted by t in Algorithm 16.7. If t = 0 then algorithm A is not successful. We now run A again, with the same random tape and the same random oracle, up until the t-th query. At this point, which recall was the critical query, the random oracle changes and the second execution of A “forks” down another path (giving the lemma its name). The lemma gives a lower bound on the probability that the two runs of A select the same critical query index t while receiving distinct oracle responses h_t ≠ h'_t to it.
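
To make the two runs concrete, here is a Python sketch of the forking algorithm. It is an illustration under stated assumptions rather than part of the proof: the name fork and its parameters are ours, the random tape is modelled as a 64-bit integer, and toy_A is a hypothetical stand-in for an adversary whose critical query index depends only on its random tape.

```python
import random

def fork(A, x, q, hash_range, rng):
    """Run A twice, resampling the oracle answers from index t onwards."""
    tape = rng.getrandbits(64)                       # the randomness r of A
    hs = [rng.randrange(hash_range) for _ in range(q)]
    t, y = A(x, hs, tape)
    if t == 0:
        return (0, 0)
    fresh = [rng.randrange(hash_range) for _ in range(q - t + 1)]
    hs2 = hs[:t - 1] + fresh                         # h_1..h_{t-1}, h'_t..h'_q
    t2, y2 = A(x, hs2, tape)
    if t != t2 or hs[t - 1] == hs2[t - 1]:
        return (0, 0)
    return (y, y2)

def toy_A(x, hs, tape):
    """Hypothetical adversary: reports its critical index and the answer it got."""
    t = 1 + tape % len(hs)
    return t, (t, hs[t - 1])

print(fork(toy_A, x=None, q=8, hash_range=101, rng=random.Random(1)))
```

Whenever fork returns a non-zero tuple, the two outputs share the critical index but record different oracle answers, which is exactly the situation exploited below.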

The lemma is important in analysing signature schemes which are of the form commit, challenge, response. To describe such signatures we use the following notation. To sign a message

• The signer produces a (possibly empty) commitment σ_1 (the commitment).

• The signer computes e = H(σ_1‖m) (the challenge).

• The signer computes σ_2 which is the ‘signature’ on σ_1 and e (the response).

We label the output of the signature schemes as (σ_1, H(σ_1‖m), σ_2) so as to keep track of the exact hash query; DSA, EC-DSA and Schnorr signatures are all of this form:

• DSA: σ_1 = ∅, e = H(m), σ_2 = (r, (e + x · r)/k (mod q)), where r = (g^k (mod p)) (mod q).

• EC-DSA: σ_1 = ∅, e = H(m), σ_2 = (r, (e + x · r)/k (mod q)), where r = x-coord([k]P).

• Schnorr signatures: σ_1 = g^k, e = H(σ_1‖m), σ_2 = x · e + k (mod q).

In all of these schemes the hash function is assumed to have codomain equal to F_q.

Recall  that  in  the  random  oracle  model  the  hash  function  is  allowed  to  be  cooked  up  by  the 
algorithm  B  to  do  whatever  it  likes.  Suppose  an  adversary  A  can  produce  an  existential  forgery 
on  a  message  m  with  non- negligible  probability  in  the  random  oracle  model.  Hence,  the  output  of 
the  adversary  is 

(m, σ_1, e, σ_2).

We can assume that the adversary makes the critical hash query for the forged message, e = H(σ_1‖m), since otherwise we can make the query for the adversary ourselves.

Algorithm  B  now  runs  the  adversary  A  again,  just  as  in  the  forking  lemma,  with  the  same 
random  tape  and  the  modified  random  oracle.  Up  until  the  critical  query,  the  hash  queries  were 
answered  the  same  way  as  before,  so  we  have  that  the  execution  of  A  in  both  runs  is  identical  up 
until  the  critical  query.  If  Algorithm  16.7  is  successful  this  means  the  two  executions  output  two 
tuples 

y = (m, σ_1, e, σ_2) and y' = (m', σ'_1, e', σ'_2),

where e and e' are the outputs from the critical query, but the inputs to this query are the same. In other words we have m = m' and σ_1 = σ'_1. This last equation does not give us anything in the case of DSA or EC-DSA, since in those cases σ_1 is always empty. However, for Schnorr signatures we obtain something useful, since we find g^k = g^{k'}, where k and k' are the underlying ephemeral keys of the two signatures. Note that A might not even know k and k' in running her attack code. This allows us to deduce k = k', and hence, since σ_2 = x · e + k and σ'_2 = x · e' + k, to recover the secret key x from the equation

x = (σ_2 − σ'_2)/(e − e') (mod q).

Notice that the denominator here is non-zero by assumption. This is the basic idea behind the
proof  of  the  following  theorem. 
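
The extraction step is plain arithmetic modulo q, as the following Python sketch shows. The function name extract_key and the second challenge value 20 are our own choices for illustration; the other numbers are those of the earlier Schnorr example. It needs Python 3.8 or later.

```python
q = 101                                   # toy group order from the Schnorr example

def extract_key(e1, s1, e2, s2):
    """Recover x from two responses that share the same ephemeral key k."""
    if (e1 - e2) % q == 0:
        raise ValueError("challenges must differ")
    return (s1 - s2) * pow((e1 - e2) % q, -1, q) % q

x, k = 3, 65
e1, e2 = 93, 20                           # two distinct answers to the critical query
s1 = (k + x * e1) % q                     # 41
s2 = (k + x * e2) % q                     # 24
print(extract_key(e1, s1, e2, s2))        # 3
```

The same calculation explains why an ephemeral key must never be reused across two genuine signatures.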

Theorem 16.12. In the random oracle model let A denote an EUF-CMA adversary against Schnorr signatures with advantage ε, making q_H queries to its hash function H. Then there is an adversary B against the discrete logarithm problem with advantage ε' such that

ε' ≥ ε^2/q_H − ε/q.


Proof. Algorithm B has as input g and h = g^x and it wishes to find x. The value h will be used as the public key by algorithm A. We first ‘package’ A into an algorithm A' which does not require access to a signing oracle as follows. The algorithm A' will take as input (h, h_1, ..., h_{q_H}, r), where r is an entry from the set of possible random tapes of algorithm A. When A makes a signing query on the message m then algorithm A' executes the following steps:

• Take the next hash query value input to A'; let this value be h_i.

• s ← Z/qZ.

• r ← g^s / h^{h_i}.

• Define H(m‖r) = h_i. If this value has already been defined then pick another value of s.

The hash function queries are handled in the usual way, using the inputs h_1, ..., h_{q_H}. If A does not terminate with a forged signature then A' outputs (0, 0), otherwise it outputs (t, y) where t is the index of the critical hash query and

y = (m, σ_1, e, σ_2),

with h_t = e. Algorithm A' is now an algorithm which we can use in the forking lemma. This gives us an algorithm B which will produce two values y and y', from which we can recover x via the above method. □


16.5.5. Nyberg-Rueppel Signatures: What happens when we want to sign a general message
which  is  itself  quite  short?  It  may  turn  out  that  the  signature  could  be  longer  than  the  message. 
Recall  that  RSA  can  be  used  either  as  a  scheme  with  appendix  or  as  a  scheme  with  message  recovery. 
So  far  none  of  our  discrete-logarithm-based  schemes  can  be  used  with  message  recovery.  We  end 
this  section  by  giving  an  example  scheme  which  does  have  the  message  recovery  property,  called 
the  Nyberg-Rueppel  signature  scheme,  which  is  based  on  discrete  logarithms  in  some  public  finite 
abelian  group  G. 

Many  signature  schemes  with  message  recovery  require  a  public  redundancy  function  R.  This 
function  maps  actual  messages  over  to  the  data  which  is  actually  signed.  This  acts  rather  like  a 
hash  function  does  in  the  schemes  based  on  signatures  with  appendix.  However,  unlike  a  hash 
function the redundancy function must be easy to invert. As a simple example we could take R to be the function

R : {0,1}^{n/2} → {0,1}^n
m ↦ m‖m.

We assume that the codomain of R can be embedded into the group G. In our description we shall use the integers modulo p, i.e. G = F_p^*, and as usual we assume that a large prime q divides p − 1 and that g is a generator of the subgroup of order q. Once again the public/private key pair is given as a discrete logarithm problem pk ← h = g^x.

Signing: Nyberg-Rueppel signatures are then produced as follows:

(1) k ← Z/qZ.

(2) r ← g^k (mod p).

(3) e ← R(m) · r (mod p).

(4) s ← x · e + k (mod q).

The signature is then the pair (e, s).

Verification  and  Recovery:  From  the  pair  (e,  s)  and  the  public  key  h  we  need  to 

•  Verify  that  the  signature  comes  from  the  user  with  public  key  h, 

•  Recover  the  message  m  from  the  pair  (e,  s). 

This  is  performed  as  follows: 

(1) u_1 ← g^s · h^{−e} = g^{s − e·x} = g^k (mod p).

(2) u_2 ← e/u_1 (mod p).

(3) Verify that u_2 lies in the range of the redundancy function, e.g. we must have u_2 = R(m) = m‖m. If this does not hold then reject the signature.

(4) Recover the message m = R^{−1}(u_2) and accept the signature.

Example: As an example we take the domain parameters q = 101, p = 607, g = 601, and as the redundancy function we take R(m) = m + 2^4 · m, where a message m must lie in [0, ..., 15]. As the public/private key pair we assume x ← 3 and h ← g^x (mod p) = 391. To sign the message m = 12 we compute an ephemeral key k ← 45 and r ← g^k (mod p) = 143. Since R(m) = m + 2^4 · m we have R(m) = 204. We then compute e ← R(m) · r (mod p) = 36, s ← x · e + k (mod q) = 52. The signature is then the pair (e, s) = (36, 52).

We now show how this signature is verified and the message recovered. We first compute u_1 = g^s · h^{−e} = 143. Notice how the verifier has computed u_1 to be the same as the value of r computed by the signer. The verifier now computes u_2 = e/u_1 (mod p) = 204. The verifier now checks that u_2 = 204 is of the form m + 2^4 · m for some value of m ∈ [0, ..., 15]. We see that u_2 is of this form and so the signature is valid. The message is then recovered by solving for m in m + 2^4 · m = 204, from which we obtain m = 12.
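
The example can be reproduced with the following Python sketch. The function names nr_sign and nr_recover are ours, and returning None for a failed redundancy check is our own convention; otherwise the code follows the signing and recovery steps above with the toy parameters of the example. It needs Python 3.8 or later.

```python
q, p, g = 101, 607, 601

def R(m):
    """Redundancy function from the example: R(m) = m + 2^4 * m for 0 <= m <= 15."""
    return m + 16 * m

def nr_sign(x, m, k):
    r = pow(g, k, p)
    e = R(m) * r % p
    s = (x * e + k) % q
    return e, s

def nr_recover(pub, sig):
    """Return the recovered message, or None if the redundancy check fails."""
    e, s = sig
    u1 = pow(g, s, p) * pow(pub, -e, p) % p
    u2 = e * pow(u1, -1, p) % p
    m, rem = divmod(u2, 17)            # R(m) = 17 * m, so invert by dividing by 17
    if rem != 0 or m > 15:
        return None
    return m

pub = pow(g, 3, p)                     # 391
sig = nr_sign(3, m=12, k=45)           # (36, 52) in the text
print(sig, nr_recover(pub, sig))
```

Note that the message itself is never sent: the verifier obtains m = 12 from the pair (e, s) alone.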

16.6.  Schemes  Avoiding  Random  Oracles 

In  the  previous  sections  we  looked  at  signature  and  encryption  schemes  which  can  be  proved  secure 
in the so-called ‘random oracle model’. A proof in the random oracle model only provides evidence that a scheme may be secure in the real world; it does not guarantee security in the real world.
We  can  interpret  a  proof  in  the  random  oracle  model  as  saying  that  if  an  adversary  against  the 
real-world  scheme  exists  then  that  adversary  must  make  use  of  the  specific  hash  function  employed. 

In  this  section  we  sketch  recent  ways  in  which  researchers  have  tried  to  construct  signature 
and  encryption  algorithms  which  do  not  depend  on  the  random  oracle  model,  i.e.  schemes  in  the 
standard  model.  We  shall  only  consider  schemes  which  are  practical,  and  we  shall  only  sketch 
the  proof  ideas.  Readers  interested  in  the  details  of  proofs  or  in  other  schemes  should  consult  the 
extensive  literature  in  this  area. 

What  we  shall  see  is  that  whilst  quite  natural  encryption  algorithms  can  be  proved  secure 
without  the  need  for  random  oracles,  the  situation  is  quite  different  for  signature  algorithms.  This 
should  not  be  surprising  since  signature  algorithms  make  extensive  use  of  hash  functions  for  their 
security.  Hence,  we  should  expect  that  they  impose  stricter  restraints  on  such  hash  functions,  which 
may  not  actually  be  true  in  the  real  world. 

16.6.1. The Cramer-Shoup Encryption Scheme: Unlike the case of signature schemes in the standard model, for encryption algorithms one can produce provably secure systems which are practical and close to those used in ‘real life’. The Cramer-Shoup encryption scheme requires as domain parameters a finite abelian group G of prime order q. In addition we require a universal one-way family of hash functions. This is a family {H_i} of hash functions for which it is hard for an adversary to choose an input x, then to draw a random hash function H_i, and then to find a different input y so that

H_i(x) = H_i(y).

Key Generation: A public key in the Cramer-Shoup scheme is chosen as follows. First the following random elements are selected

g_1, g_2 ← G,
x_1, x_2, y_1, y_2, z ← Z/qZ.

The user then computes the following elements

c ← g_1^{x_1} · g_2^{x_2},
d ← g_1^{y_1} · g_2^{y_2},
h ← g_1^z.

The user finally chooses a hash function H from the universal one-way family of hash functions and outputs the public key pk ← (g_1, g_2, c, d, h, H), whilst keeping secret the private key sk ← (x_1, x_2, y_1, y_2, z).


Encryption: The encryption algorithm is very similar to ElGamal encryption. The message m is considered as an element of G, and encryption proceeds as follows:

r ← Z/qZ,
u_1 ← g_1^r,
u_2 ← g_2^r,
e ← m · h^r,
α ← H(u_1‖u_2‖e),
v ← c^r · d^{r·α}.

The ciphertext is then the quadruple (u_1, u_2, e, v).


Decryption:  On  receiving  this  ciphertext  the  owner  of  the  private  key  can  recover  the  message  as 
follows: First they compute α ← H(u_1‖u_2‖e) and test whether

u_1^{x_1 + y_1·α} · u_2^{x_2 + y_2·α} = v.

If this equation does not hold then the ciphertext should be rejected. If this equation holds then the receiver can decrypt the ciphertext by computing

m ← e / u_1^z.

Notice  that,  whilst  very  similar  to  ElGamal  encryption,  the  Cramer-Shoup  encryption  scheme  is 
much  less  efficient.  Hence,  whilst  provably  secure  it  is  not  used  much  in  practice. 
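
A toy Python sketch of the three algorithms may help to see how the validity check fits together. Everything concrete here is an assumption made for illustration: the group is the order-101 subgroup of F_607^* used in earlier examples, g_2 is derived from g_1 with a known exponent (which a real instantiation must avoid), and the hash is SHA-256 reduced modulo q rather than a member of a universal one-way family. It needs Python 3.8 or later.

```python
import hashlib
import secrets

# Toy group: the order-101 subgroup of F_607^* (far too small for real use).
q, p = 101, 607
g1 = 601
g2 = pow(g1, 7, p)    # illustration only; g2 should be chosen independently of g1

def H(u1, u2, e):
    """Stand-in hash: SHA-256 of the three elements, reduced modulo q."""
    data = f"{u1}|{u2}|{e}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def keygen():
    x1, x2, y1, y2, z = (secrets.randbelow(q) for _ in range(5))
    c = pow(g1, x1, p) * pow(g2, x2, p) % p
    d = pow(g1, y1, p) * pow(g2, y2, p) % p
    h = pow(g1, z, p)
    return (c, d, h), (x1, x2, y1, y2, z)

def encrypt(pk, m):
    c, d, h = pk
    r = secrets.randbelow(q)
    u1, u2 = pow(g1, r, p), pow(g2, r, p)
    e = m * pow(h, r, p) % p
    alpha = H(u1, u2, e)
    v = pow(c, r, p) * pow(d, r * alpha, p) % p
    return u1, u2, e, v

def decrypt(sk, ct):
    x1, x2, y1, y2, z = sk
    u1, u2, e, v = ct
    alpha = H(u1, u2, e)
    if pow(u1, x1 + y1 * alpha, p) * pow(u2, x2 + y2 * alpha, p) % p != v:
        return None                       # reject an invalid ciphertext
    return e * pow(u1, -z, p) % p

pk, sk = keygen()
m = pow(g1, 5, p)                         # a message encoded as a group element
ct = encrypt(pk, m)
print(decrypt(sk, ct) == m)
```

Multiplying the last component v of a ciphertext by g_1 makes decrypt return None, which is the rejection behaviour that gives the scheme its security against chosen ciphertext attacks.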


Security: To show that the scheme is provably secure, under the assumption that the DDH problem is hard and that H is chosen from a universal one-way family of hash functions, we assume
we  have  an  adversary  A  against  the  scheme  and  show  how  to  use  A  in  another  algorithm  B  which 
tries  to  solve  the  DDH  problem. 

One way to phrase the DDH problem is as follows: Given (g_1, g_2, u_1, u_2) ∈ G^4 determine whether this quadruple is a random quadruple or we have u_1 = g_1^r and u_2 = g_2^r for some value of r ∈ Z/qZ. So algorithm B will take as input a quadruple (g_1, g_2, u_1, u_2) ∈ G^4 and try to determine whether this is a random quadruple or a quadruple related to the Diffie-Hellman problem.

Algorithm B first needs to choose a public key, which it does in a non-standard way, by first selecting the random elements

x_1, x_2, y_1, y_2, z_1, z_2 ∈ Z/qZ.

Algorithm B then computes the following elements

c ← g_1^{x_1} · g_2^{x_2},
d ← g_1^{y_1} · g_2^{y_2},
h ← g_1^{z_1} · g_2^{z_2}.

Finally B chooses a hash function H from the universal one-way family of hash functions and outputs the public key pk ← (g_1, g_2, c, d, h, H). Notice that the part of the public key corresponding to h has been chosen differently than in the real scheme, but that algorithm A will not be able to detect this change. Algorithm B now runs algorithm A, responding to decryption queries of (u'_1, u'_2, e', v') by computing

m' ← e' / ((u'_1)^{z_1} · (u'_2)^{z_2}),

after performing the standard check on validity of the ciphertext.

At some point A calls its O_LR oracle on the two plaintexts m_0 and m_1. Algorithm B chooses a bit b at random and computes the target ciphertext as

e ← m_b · (u_1^{z_1} · u_2^{z_2}),
α ← H(u_1‖u_2‖e),
v ← u_1^{x_1 + y_1·α} · u_2^{x_2 + y_2·α}.

The target ciphertext is then the quadruple (u_1, u_2, e, v). Notice that when the input to B is a
legitimate  DDH  quadruple  then  the  target  ciphertext  will  be  a  valid  encryption,  but  when  the  input 
to  B  is  not  a  legitimate  DDH  quadruple  then  the  target  ciphertext  is  highly  likely  to  be  an  invalid 
ciphertext.  This  target  ciphertext  is  then  returned  to  the  adversary  A. 

If  the  adversary  outputs  the  correct  value  of  b  then  we  suspect  that  the  input  to  B  is  a  valid 
DDH  quadruple,  whilst  if  the  output  is  wrong  then  we  suspect  that  the  input  to  B  is  not  valid. 
This  produces  a  statistical  test  to  detect  whether  the  input  to  B  was  valid  or  not.  By  repeating 
this  test  a  number  of  times  we  can  produce  as  accurate  a  statistical  test  as  we  want. 

Note  that  the  above  is  only  a  sketch.  We  need  to  show  that  the  view  of  the  adversary  A  in 
the  above  game  is  no  different  from  that  in  a  real  attack  on  the  system,  otherwise  A  would  know 
something  was  not  correct.  For  example  we  need  to  show  that  the  responses  B  makes  to  the 
decryption  queries  of  A  cannot  be  distinguished  from  a  true  decryption  oracle.  For  further  details 
one  should  consult  the  full  proof  in  the  paper  mentioned  in  the  Further  Reading  section. 


16.6.2. Cramer-Shoup Signatures: We have already remarked that signature schemes which
are  provably  secure,  without  the  random  oracle  model,  are  hard  to  come  by.  They  also  appear 
somewhat  contrived  compared  with  the  schemes  such  as  RSA-PSS,  DSA  or  Schnorr  signatures 
which  are  used  in  real  life.  The  first  such  provably  secure  signature  scheme  in  the  standard  model 
was  by  Goldwasser,  Micali  and  Rivest.  This  was  however  not  very  practical  as  it  relied  on  messages 
being  associated  with  leaves  of  a  binary  tree,  and  each  node  in  the  tree  needed  to  be  authenticated 
with  respect  to  its  parent.  This  made  the  resulting  scheme  far  too  slow. 

However,  even  with  today’s  knowledge  the  removal  of  the  use  of  random  oracles  comes  at  a 
significant  price.  We  need  to  make  stronger  intractability  assumptions  than  we  have  otherwise 
made.  In  this  section  we  introduce  a  new  RSA-based  problem  called  the  Flexible  RSA  Problem. 
This  is  a  potentially  easier  problem  than  ones  we  have  met  before,  hence  the  assumption  that  the 
problem  is  hard  is  a  much  stronger  assumption  than  before.  The  strong  RSA  assumption  is  the 
assumption  that  the  Flexible  RSA  Problem  is  hard. 

Definition 16.13 (Flexible RSA Problem). Given an RSA modulus N = p · q and a random c ∈ (Z/NZ)* find e > 1 and m ∈ (Z/NZ)* such that

m^e = c.

Clearly if we can solve the RSA problem then we can solve the Flexible RSA Problem. This
means  that  the  strong  RSA  assumption  is  a  stronger  intractability  assumption  than  the  standard 
RSA  assumption,  in  that  it  is  conceivable  that  in  the  future  we  may  be  able  to  solve  the  Flexible 
RSA  Problem  but  not  the  traditional  RSA  problem.  However,  at  present  we  conjecture  that  both 
problems  are  equally  hard. 

The  Cramer-Shoup  signature  scheme  is  based  on  the  strong  RSA  assumption  and  is  provably 
secure,  without  the  need  for  the  random  oracle  model.  The  main  efficiency  issue  with  the  scheme 
is  that  the  signer  needs  to  generate  a  new  prime  number  for  each  signature  produced,  which  can 
be  rather  costly.  In  our  discussion  below  we  shall  assume  H  is  a  ‘standard’  hash  function  which 
outputs  bit  strings  of  256  bits,  which  we  interpret  as  256-bit  integers  as  usual. 

Key  Generation:  To  generate  the  public  key,  we  create  an  RSA  modulus  N  which  is  the  product 
of two “safe” primes p and q, i.e. p ← 2 · p' + 1 and q ← 2 · q' + 1 where p' and q' are primes. We also choose two random elements

h, x ∈ Q_N,

where, as usual, Q_N is the set of quadratic residues modulo N. We also create a random 256-bit prime e'. The public key consists of

(N, h, x, e')

and the private key is the factors p and q.


Signing: To sign a message m the signer generates another 256-bit prime number e and another random element y' ∈ Q_N. Since they know the factors of N, the signer can compute the solution y to the equation

y = (x · h^{H(x')})^{1/e} (mod N),

where x' satisfies

x' = y'^{e'} · h^{−H(m)}.

The output of the signer is (e, y, y').


Verification: To verify a message the verifier first checks that e is an odd number satisfying e ≠ e'. Then the verifier computes

x' ← y'^{e'} · h^{−H(m)}

and then checks that

x = y^e · h^{−H(x')}.
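
The signing and verification equations can be exercised on a toy modulus. The sketch below is an illustration under stated assumptions only: the safe primes 23 and 47, the small primes standing in for the 256-bit primes e and e', the use of SHA-256 for H, and the encoding of x' as a decimal string are all our own choices, and the function names cs_sign and cs_verify are hypothetical. It needs Python 3.8 or later.

```python
import hashlib

# Toy parameters (assumptions): safe primes 23 = 2*11 + 1 and 47 = 2*23 + 1.
P_FACTOR, Q_FACTOR = 23, 47
N = P_FACTOR * Q_FACTOR
QR_ORDER = 11 * 23              # order of Q_N, known only to the signer
E_PRIME = 13                    # toy stand-in for the 256-bit prime e'

def H(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def cs_sign(pk, m: bytes, e, y1):
    """Sign m with a fresh prime e and an element y1 of Q_N (y' in the text)."""
    _, h, x, e_prime = pk
    x1 = pow(y1, e_prime, N) * pow(h, -H(m), N) % N       # x'
    target = x * pow(h, H(str(x1).encode()), N) % N
    y = pow(target, pow(e, -1, QR_ORDER), N)              # e-th root in Q_N
    return e, y, y1

def cs_verify(pk, m: bytes, sig):
    _, h, x, e_prime = pk
    e, y, y1 = sig
    if e % 2 == 0 or e == e_prime:
        return False
    x1 = pow(y1, e_prime, N) * pow(h, -H(m), N) % N
    return x == pow(y, e, N) * pow(h, -H(str(x1).encode()), N) % N

pk = (N, 4, 9, E_PRIME)                   # h = 2^2 and x = 3^2 are quadratic residues
sig = cs_sign(pk, b"msg", e=17, y1=25)    # y' = 5^2
print(cs_verify(pk, b"msg", sig))
```

The only step that uses the private key is the computation of the e-th root, which requires the order of Q_N and hence the factorization of N.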


Security:  On  the  assumption  that  H  is  a  collision  resistant  hash  function  and  the  Flexible  RSA 
Problem  is  hard,  one  can  prove  that  the  above  scheme  is  secure  against  active  adversaries.  We 
sketch  the  most  important  part  of  the  proof,  but  the  full  details  are  left  to  the  interested  reader  to 
look  up  in  the  paper  mentioned  at  the  end  of  this  chapter. 

Assume the adversary makes $t$ queries to a signing oracle. We want to use the adversary $A$ to create an algorithm $B$ to break the strong RSA assumption for the modulus $N$. Before setting up the public key for input to algorithm $A$, the algorithm $B$ first decides on what prime values $e_i$ it will output in the signature queries. Then, having knowledge of the $e_i$, the algorithm $B$ concocts values for the $h$ and $x$ in the public key, so that it always knows the $e_i$-th root of $h$ and $x$.

Thus when given a signing query for a message $m_i$, algorithm $B$ can then compute a valid signature, without knowing the factorization of $N$, by generating $y'_i \in Q_N$ at random and then computing

$$x'_i \leftarrow y_i'^{e'} \cdot h^{-H(m_i)} \pmod{N}$$

and then

$$y_i \leftarrow x^{1/e_i} \cdot \left(h^{1/e_i}\right)^{H(x'_i)} \pmod{N},$$

the signature being given by

$$(e_i, y_i, y'_i).$$

The  above  basic  signing  simulation  is  modified  in  the  full  proof,  depending  on  what  type  of  forgery 
algorithm  A  is  producing.  But  the  basic  idea  is  that  B  creates  a  public  key  to  enable  it  to  respond 
to  every  signing  query  in  a  valid  way. 


Chapter  Summary 


• ElGamal encryption is a system based on the difficulty of the Diffie-Hellman problem (DHP).

•  Paillier  encryption  is  a  scheme  which  is  based  on  the  Composite  Decision  Residuosity 
Problem  (CDRP). 

•  The  random  oracle  model  is  a  computational  model  used  in  provable  security.  A  proof  in 
the  random  oracle  model  does  not  mean  the  system  is  secure  in  the  real  world,  it  only 
provides  evidence  that  it  may  be  secure. 

•  In  the  random  oracle  model  one  can  prove  that  the  ubiquitous  RSA  encryption  method, 
namely  RSA-OAEP,  is  secure. 

•  We  presented  the  KEM-DEM  paradigm  for  designing  public  key  encryption  schemes. 

•  We  gave  the  RSA-KEM  (resp.  DHIES-KEM)  and  showed  why  it  is  secure,  assuming  the 
RSA  (resp.  Gap  Diffie-Hellman)  problem  is  hard. 

• In the random oracle model the two main RSA based signature schemes used in ‘real life’ are also secure, namely RSA-FDH and RSA-PSS.

•  DSA  is  a  signature  algorithm  based  on  discrete  logarithms;  it  has  reduced  bandwidth 
compared  with  RSA  but  is  slower.  EC-DSA  is  the  elliptic  curve  variant  of  DSA;  it  has 
reduced  bandwidth  and  greater  efficiency  compared  to  DSA. 

•  In  the  random  oracle  model  one  can  use  the  forking  lemma  to  show  that  the  Schnorr 
signature  scheme  is  secure. 

•  The  Flexible  RSA  Problem  is  a  natural  weakening  of  the  standard  RSA  problem. 

•  The  Cramer-Shoup  encryption  scheme  is  provably  secure,  without  the  random  oracle 
model,  assuming  the  DDH  problem  is  hard.  It  is  around  three  times  slower  than  ElGamal 
encryption. 


Further  Reading 

The paper by Cramer and Shoup on public key encryption presents the basics of hybrid encryption in great detail, as well as the scheme for public key encryption without random oracles. The
other  paper  by  Cramer  and  Shoup  presents  their  signature  scheme.  The  DHIES  scheme  was  first 
presented  in  the  paper  by  Abdalla  et  al.  A  good  paper  to  look  at  for  various  KEM  constructions  is 
that  by  Dent.  The  full  proof  of  the  security  of  RSA-OAEP  is  given  in  the  paper  of  Fujisaki  et  al. 

A  good  description  of  the  forking  lemma  and  its  applications  is  given  in  the  article  of  Pointcheval 
and Stern. The random oracle model and a number of applications including RSA-FDH and RSA-PSS are given in the papers of Bellare and Rogaway.
CHAPTER 17


Cryptography  Based  on  Really  Hard  Problems 


Chapter  Goals 

•  To  introduce  the  concepts  of  complexity  theory  needed  to  study  cryptography. 

•  To  understand  why  complexity  theory  on  its  own  cannot  lead  to  secure  cryptographic 
systems. 

•  To  introduce  the  idea  of  random  self-reductions. 

•  To  explain  the  Merkle-Hellman  system  and  why  it  is  weak. 

•  To  sketch  the  idea  behind  worst-case  to  average-case  reductions  for  lattices. 

• To introduce the Learning with Errors problem, and the Ring-Learning with Errors problem, and to describe a public key encryption scheme based on them.

• To sketch how to extend this encryption scheme to produce a fully homomorphic encryption scheme.


17.1.  Cryptography  and  Complexity  Theory 

Up  until  now  we  have  looked  at  basing  cryptography  on  problems  which  are  believed  to  be  hard,  e.g. 
that  AES  is  a  PRF,  that  factoring  a  product  of  large  primes  is  hard,  that  finding  discrete  logarithms 
is  hard.  But  there  is  no  underlying  reason  why  these  problems  should  be  hard.  Computer  Science 
gives  us  a  whole  theory  of  categorizing  hard  problems,  called  complexity  theory.  Yet  none  of  our 
hard  problems  appear  to  be  what  a  complexity  theorist  would  call  hard.  Indeed,  in  comparison  to 
what  complexity  theorists  discuss,  factoring  and  discrete  logarithms  are  comparatively  easy. 

We  are  going  to  need  to  recap  some  basic  complexity  theory,  most  of  which  one  can  find  in 
a  basic  computer  science  undergraduate  curriculum,  or  by  reading  the  book  by  Goldreich  in  the 
Further  Reading  section  of  this  chapter. 

17.1.1. Decision and Search Problems: A decision problem $\mathcal{DP}$ is a problem with a yes/no answer, which has inputs (called instances) $\iota$ coded in some way (for example as a binary string of some given size $n$). Often one has a certain set $S$ of instances in mind and one is asking “Is $\iota \in S$?”. For example one could have

• $\iota$ is the encoding of an integer, and $S$ is the set of all primes. Hence the decision problem is: Given an integer, $N$, say whether it is prime or not.

• $\iota$ is the encoding of a graph, and $S$ is the subset of all graphs which are colourable using $k$ colours only. Recall that a graph consisting of vertices $V$ and edges $E$ is colourable by $k$ colours if one can assign a colour (or label) to each vertex so that no two vertices connected by an edge share the same colour. Hence the decision problem is: Given a graph, $G$, say whether it is colourable using only $k$ colours.

Whilst we have restricted ourselves to decision problems, one can often turn a standard computational problem into a decision problem. As an example of this consider the cryptographically important knapsack problem.
Definition 17.1 (Decision Knapsack Problem). Given a pile of $n$ items, with different weights $w_i$, is it possible to put items into a knapsack to make a specific weight $S$? In other words, do there exist $b_i \in \{0, 1\}$ such that $S = b_1 \cdot w_1 + b_2 \cdot w_2 + \cdots + b_n \cdot w_n$?

As stated above the knapsack problem is a decision problem but we could ask for an algorithm to actually search for the values $b_i$.

Definition 17.2 (Knapsack (Search) Problem). Given a pile of $n$ items, with different weights $w_i$, is it possible to put items into a knapsack to make a specific weight $S$? If so can one find the $b_i \in \{0, 1\}$ such that $S = b_1 \cdot w_1 + b_2 \cdot w_2 + \cdots + b_n \cdot w_n$? We assume only one such assignment of weights is possible.

Note that the time taken to solve either of these variants of the knapsack problem seems to grow in the worst case as an exponential function of the number of weights. We can turn an oracle for the decision knapsack problem into one for the knapsack problem proper. To see this consider Algorithm 17.1 which assumes an oracle $\mathcal{O}(w_1, \ldots, w_n, S)$ for the decision knapsack problem.

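Algorithm 17.1: Knapsack algorithm, assuming a decision knapsack oracle

if $\mathcal{O}(w_1, \ldots, w_n, S) =$ false then return false.
$T \leftarrow S$.
$b_1 \leftarrow 0, \ldots, b_n \leftarrow 0$.
for $i = 1$ to $n$ do
    if $T = 0$ then return $(b_1, \ldots, b_n)$.
    if $\mathcal{O}(w_{i+1}, \ldots, w_n, T - w_i) =$ true then
        $T \leftarrow T - w_i$.
        $b_i \leftarrow 1$.
return $(b_1, \ldots, b_n)$.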

A decision problem $\mathcal{DP}$, characterized by a set $S$, is said to lie in complexity class $\mathcal{P}$ if there is an algorithm which takes any instance $\iota$, and decides whether or not $\iota \in S$ in polynomial time. We measure time in terms of bit operations and polynomial time means the number of bit operations is bounded by some polynomial function of the input size of the instance $\iota$.

The problems which lie in complexity class $\mathcal{P}$ are those for which we have an “efficient” solution algorithm¹. In other words things in complexity class $\mathcal{P}$ are those things which are believed to be easy to compute. For example:

• Given integers $x$, $y$ and $z$ do we have $z = x \cdot y$, i.e. is multiplication easy?

• Given a ciphertext $c$, a key $k$ and a plaintext $m$, is $c$ the encryption of $m$ under your favourite encryption algorithm?

Of course in the last example I have assumed your favourite encryption algorithm has an encryption/decryption algorithm which runs in polynomial time. If your favourite encryption algorithm is not of this form, then one must really ask how have you read so far in this book?


17.1.2. The class $\mathcal{NP}$: A decision problem lies in complexity class $\mathcal{NP}$, called non-deterministic polynomial time, if for every instance for which the answer is yes, there exists a witness for this which can be checked in polynomial time. If the answer is no we do not assume the algorithm terminates, but if it does it must do so by answering no. One should think of the witness as a proof that the instance $\iota$ lies in the subset $S$. Examples include

• The problem “Is $N$ composite?” lies in $\mathcal{NP}$ as one can give a non-trivial prime factor as a witness. This witness can be checked in polynomial time since division can be performed in polynomial time.

• The problem “Is $G$ $k$-colourable?” lies in $\mathcal{NP}$ since as a witness one can give the colouring.

• The problem “Does this knapsack problem have a solution?” lies in $\mathcal{NP}$ since as a witness we can give the values $b_i$.

¹ Of course efficiency in practice may be different as the polynomial bounding the run time could have high degree.

Note that in these examples we do not assume the witness itself can be computed in polynomial time, only that the witness can be checked in polynomial time. Note that we trivially have

$$\mathcal{P} \subseteq \mathcal{NP}.$$
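To make the notion of a polynomial-time checkable witness concrete, the following Python sketch gives one possible checker for each of the three examples above. The function names and the input encodings (an edge list, a dictionary of colours, a list of bits) are assumptions chosen for illustration.

```python
def check_composite_witness(n, factor):
    """A non-trivial factor proves that n is composite: one division suffices."""
    return 1 < factor < n and n % factor == 0


def check_colouring_witness(edges, colouring, k):
    """colouring maps every vertex to a colour in range(k); no edge may be monochromatic."""
    if any(not 0 <= c < k for c in colouring.values()):
        return False
    return all(colouring[u] != colouring[v] for u, v in edges)


def check_knapsack_witness(weights, target, bits):
    """The bits b_i are a witness if they select weights summing to the target."""
    if len(bits) != len(weights) or any(b not in (0, 1) for b in bits):
        return False
    return sum(w * b for w, b in zip(weights, bits)) == target


triangle = [(0, 1), (1, 2), (0, 2)]
print(check_composite_witness(3233, 61))
print(check_colouring_witness(triangle, {0: 0, 1: 1, 2: 2}, 3))
print(check_knapsack_witness([62, 93, 81, 88, 102, 37], 174, [0, 1, 1, 0, 0, 0]))
```

Each check runs in time polynomial in the size of the instance, even though finding the witness in the first place may be hard.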

The main open problem in theoretical computer science is the question: Does $\mathcal{P} = \mathcal{NP}$? Most people believe in the following conjecture.

Conjecture 17.3.

$$\mathcal{P} \neq \mathcal{NP}.$$

The class co-$\mathcal{NP}$ is the set of problems for which a witness exists for every instance with a no response which can be checked in polynomial time. It is conjectured that $\mathcal{NP} \neq$ co-$\mathcal{NP}$, and if a problem lies in $\mathcal{NP} \cap$ co-$\mathcal{NP}$ then this is seen as evidence that the problem cannot be $\mathcal{NP}$-complete. For example the problem of determining whether a number $n$ has a prime factor less than $m$ can be shown to lie in $\mathcal{NP} \cap$ co-$\mathcal{NP}$, thus we do not think that this problem is $\mathcal{NP}$-complete. In other words, factoring is not that hard a problem from a complexity-theoretic point of view.

One can consider trying to see how small a witness for being in class $\mathcal{NP}$ can be. For example consider the problem COMPOSITES. Namely, given $N \in \mathbb{Z}$ determine whether $N$ is composite. As we remarked earlier this clearly lies in class $\mathcal{NP}$. But a number $N$ can be proved composite in the following ways:

• Giving a factor. In this case the size of the witness is $O(\log N)$.

• Giving a Miller-Rabin witness $a$. Now, assuming the Generalized Riemann Hypothesis (GRH) the size of a witness can be bounded by $O(\log \log N)$ since we have $a \le O((\log N)^2)$.
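A Miller-Rabin witness can be checked with a handful of modular exponentiations. The sketch below is a minimal illustration with hypothetical function names; the search bound $2(\ln N)^2$ used in `smallest_witness` is Bach's explicit form of the GRH bound and is stated here as an assumption, not something proved in the text.

```python
import math


def is_miller_rabin_witness(a, n):
    """Return True if a proves that the odd number n > 2 is composite."""
    d, s = n - 1, 0
    while d % 2 == 0:          # write n - 1 = 2^s * d with d odd
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return False
    for _ in range(s - 1):
        x = pow(x, 2, n)
        if x == n - 1:
            return False
    return True


def smallest_witness(n):
    """Search for a witness below 2 (ln n)^2, the bound assumed under GRH."""
    bound = int(2 * math.log(n) ** 2)
    for a in range(2, min(bound, n - 2) + 1):
        if is_miller_rabin_witness(a, n):
            return a
    return None


print(smallest_witness(561))    # 561 = 3 * 11 * 17 is a Carmichael number
print(smallest_witness(97))     # 97 is prime, so no witness exists
```

Because a witness $a$ is at most polynomial in $\log N$, writing it down needs only $O(\log \log N)$ bits, which is where the witness size quoted above comes from.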

A decision problem $\mathcal{DP}$, in the class $\mathcal{NP}$, is said to be $\mathcal{NP}$-complete if every other problem in class $\mathcal{NP}$ can be reduced to this problem in polynomial time. In other words we have: if $\mathcal{DP}$ is $\mathcal{NP}$-complete then

$$\mathcal{DP} \in \mathcal{P} \text{ implies } \mathcal{P} = \mathcal{NP}.$$

In some sense the $\mathcal{NP}$-complete problems are the hardest problems for which it is feasible to ask for a solution. There are a huge number of $\mathcal{NP}$-complete problems of which two will interest us:

• the 3-colouring problem,

• the knapsack problem.

A problem $\mathcal{DP}$ is called $\mathcal{NP}$-hard if we have that $\mathcal{DP} \in \mathcal{P}$ implies $\mathcal{P} = \mathcal{NP}$, but we do not know whether $\mathcal{DP} \in \mathcal{NP}$. Thus an $\mathcal{NP}$-hard problem is a problem which is as hard as every problem in $\mathcal{NP}$, and could be even harder.

17.1.3. Average vs Worst-Case Complexity: It is a widely held view that all the standard hard problems on which cryptography is based, e.g. factoring, discrete logarithms etc., are not equivalent to an $\mathcal{NP}$-complete problem even though they lie in class $\mathcal{NP}$. From this we can conclude that factoring, discrete logarithms etc. are not very difficult problems at all, at least not compared with the knapsack problem or the 3-colouring problem. So why do we not use $\mathcal{NP}$-complete problems on which to base our cryptographic schemes? These are, after all, a well-studied set of problems for which we do not expect there ever to be an efficient solution.

However, this approach has had a bad track record, as we shall show later when we consider the knapsack-based system of Merkle and Hellman. For now we simply mention that the theory of $\mathcal{NP}$-completeness is about worst case complexity. But for cryptography we want a problem which, for suitably chosen parameters, is hard on average. It turns out that the knapsack problems that have in the past been proposed for use in cryptography are always “average” and efficient algorithms can always be found to solve them.
We illustrate this difference between hard and average problems using the $k$-colouring problem, when $k = 3$. Although determining whether a graph is 3-colourable is in general (in the worst case) $\mathcal{NP}$-complete, it is very easy on average. This is because the average graph, no matter how large it is, will not be 3-colourable. In fact, for almost all input graphs the following algorithm will terminate saying that the graph is not 3-colourable in a constant number of iterations.

• Take a graph $G$ and order the vertices in any order $v_1, \ldots, v_t$.

•  Call  the  colours  {1,  2,  3}. 

•  Now  traverse  the  graph  in  the  order  of  the  vertices  just  decided. 

•  On  visiting  a  new  vertex  select  the  smallest  possible  colour  (i.e.  one  from  the  set  {1,  2,  3}) 
which  does  not  appear  as  the  colour  of  an  adjacent  vertex.  Select  this  as  the  colour  of  the 
current  vertex. 

•  If  you  get  stuck  (i.e.  no  such  colour  exists)  traverse  back  up  the  graph  to  the  most  recently 
coloured  vertex  and  use  the  next  colour  available,  then  continue  down  again. 

•  If  at  any  point  you  run  out  of  colours  for  the  first  vertex  then  terminate  and  say  the  graph 
is  not  3-colourable. 

•  If  you  are  able  to  colour  the  last  vertex  then  terminate  and  output  that  the  graph  is 
3-colourable. 

The  interesting  thing  about  the  above  algorithm  is  that  it  can  be  shown  that  for  a  random  graph 
of  t  vertices  the  average  number  of  vertices  travelled  in  the  algorithm  is  less  than  197  regardless  of 
the  number  of  vertices  t  in  the  graph. 
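The backtracking procedure described above can be sketched in Python as follows. The adjacency-list encoding, the function name and the visit counter are assumptions made for this illustration; the counter simply records how many times a vertex is visited, and the sketch makes no attempt to reproduce the average-case bound quoted in the text.

```python
def three_colour(adjacency):
    """Backtracking 3-colouring. adjacency[v] lists the neighbours of vertex v.

    Returns (colouring, visits): colouring is a list of colours from {1, 2, 3},
    or None if the graph is not 3-colourable; visits counts vertex visits.
    """
    t = len(adjacency)
    colour = [0] * t            # 0 means "not yet coloured"
    visits = 0

    def extend(v):
        nonlocal visits
        if v == t:              # the last vertex has been coloured
            return True
        visits += 1
        used = {colour[u] for u in adjacency[v]}
        for c in (1, 2, 3):     # always try the smallest available colour first
            if c not in used:
                colour[v] = c
                if extend(v + 1):
                    return True
        colour[v] = 0           # stuck: undo and backtrack to the previous vertex
        return False

    return (colour if extend(0) else None), visits


triangle = [[1, 2], [0, 2], [0, 1]]
k4 = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
print(three_colour(triangle))
print(three_colour(k4))
```

On the complete graph with four vertices the search gives up after exhausting the colours of the first vertex, which is the behaviour the text expects on a typical (dense) random graph.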

Now  consider  the  case  of  factoring.  Factoring  is  actually  easy  on  average,  since  if  you  give  me 
a  large  number  I  can  usually  find  a  factor  quite  quickly.  Indeed  for  fifty  percent  of  large  numbers  I 
can  find  a  factor  by  just  examining  one  digit  of  the  number;  after  all  fifty  percent  of  large  numbers 
are  even!  When  we  consider  factoring  in  cryptography  we  implicitly  mean  that  we  construct  a 
hard instance by multiplying two large primes together. So could such an example work for $\mathcal{NP}$-complete problems? That is, can we quickly write down an instance which we know to be one of the
hardest  instances  of  such  a  problem?  Think  of  the  graph  colouring  example.  How  can  we  construct 
a  graph  which  is  hard  to  colour,  but  is  3-colourable?  This  seems  really  difficult.  Later  we  shall  see 
that  for  some  hard  problems  related  to  lattices,  we  can  choose  parameters  that  make  the  average 
case  as  difficult  as  the  worst  case  of  another  (related)  hard  lattice  problem. 
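The claim that factoring is easy on average can be checked empirically. The following Python sketch counts, over a block of consecutive large integers, how many are even and how many have some factor below 100; the starting point, the block size and the bound 100 are arbitrary choices made for this illustration.

```python
def has_small_factor(n, bound=100):
    """True if n is divisible by some integer d with 2 <= d < bound."""
    return any(n % d == 0 for d in range(2, bound))


START, COUNT = 10**12, 10_000
block = range(START, START + COUNT)

even_fraction = sum(n % 2 == 0 for n in block) / COUNT
small_fraction = sum(has_small_factor(n) for n in block) / COUNT

print(even_fraction)     # exactly half of the block is even
print(small_fraction)    # a large majority has a factor below 100
```

Only integers all of whose prime factors are large escape this trivial test, which is why cryptographic instances are built deliberately as a product of two large primes.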

17.1.4.  Random  Self-reductions:  So  we  know  that  we  should  select  as  hard  a  problem  as 
possible  on  which  to  base  our  cryptographic  security.  For  our  public  key  schemes  we  pick  parameters 
to  avoid  the  obvious  attacks  based  on  factoring  and  discrete  logarithms  from  Chapters  2  and  3. 
However,  we  also  know  that  we  do  not  actually  base  many  cryptographic  schemes  on  factoring  or 
discrete  logarithms  per  se.  We  actually  use  the  RSA  problem  or  the  DDH  problem.  So  we  need  to 
determine  whether  these  problems  are  hard  on  average,  and  in  what  sense. 

For  example  given  an  RSA  modulus  N  and  a  public  exponent  e  it  might  be  hard  to  solve 

$$c = m^e \pmod{N}$$

for  a  specific  c  in  the  worst  case,  but  it  could  be  easy  on  average  for  most  values  of  c.  It  turns  out 
that  one  can  prove  that  problems  such  as  the  RSA  problem  for  a  fixed  modulus  N  or  DDH  for  a 
fixed  group  G  are  hard  on  average.  The  technique  to  do  this  is  based  on  a  random  self-reduction 
from  one  given  problem  instance  to  another  random  problem  instance  of  the  same  problem.  This 
means  that  if  we  can  solve  the  problem  on  average  then  we  can  solve  the  problem  in  the  worst  case, 
since  if  we  had  a  worst-case  problem,  we  could  randomize  it  until  we  hit  upon  an  easy  average-case 
problem.  Hence,  the  worst-case  behaviour  of  the  problem  and  the  average-case  behaviour  of  the 
problem  must  be  similar. 

Lemma 17.4. The RSA problem is random self-reducible.

Proof. Suppose we are given $c$ and are asked to solve $c = m^e \pmod{N}$, where the idea is that this is a “hard” problem instance. We reduce this to an “average” problem instance by choosing $s \leftarrow (\mathbb{Z}/N\mathbb{Z})^*$ at random and setting $c' \leftarrow s^e \cdot c$. We then try to solve

$$c' = m'^e \pmod{N}.$$

If we are unsuccessful we choose another value of $s$ until we hit the “average” type problem. If the average case was easy then we could solve $c' = m'^e \pmod{N}$ for $m'$ and then set $m \leftarrow m'/s$ and terminate. □
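The reduction in the proof can be traced on a toy modulus. In the Python sketch below the “average-case solver” is simulated: it cheats by using the private exponent, and it deliberately refuses to answer on most instances. That solver, the tiny modulus $N = 61 \cdot 53$, and the function names are assumptions for illustration only.

```python
import math
import random

N, E, D = 3233, 17, 2753      # toy RSA key: N = 61 * 53, E * D = 1 mod 3120


def average_case_solver(c):
    """Hypothetical solver that only succeeds on some instances (here: c divisible by 4)."""
    if c % 4 != 0:
        return None
    return pow(c, D, N)        # simulated using the private exponent


def solve_any_instance(c, rng):
    """Randomize c until the average-case solver succeeds, then undo the blinding."""
    while True:
        s = rng.randrange(2, N)
        if math.gcd(s, N) != 1:
            continue
        c_blinded = pow(s, E, N) * c % N           # c' = s^e * c
        m_blinded = average_case_solver(c_blinded)
        if m_blinded is not None:
            return m_blinded * pow(s, -1, N) % N   # m = m' / s


message = 65
ciphertext = pow(message, E, N)
print(solve_any_instance(ciphertext, random.Random(1)))
```

Since $c'$ is (essentially) uniformly distributed whatever the value of $c$, the expected number of attempts depends only on how often the solver succeeds on random inputs, not on the particular $c$ we started from.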

One can also show that the DDH problem is random self-reducible, in the sense that testing whether $(A, B, C) = (g^a, g^b, g^c)$ is a valid Diffie-Hellman triple, i.e. whether $c = a \cdot b$, does not depend on the particular choices of $a$, $b$ and $c$. To see this consider the related triple $(A', B', C') = (g^{a'}, g^{b'}, g^{c'}) = (A^u, B^v, C^{u \cdot v})$ for random $u$ and $v$. Now if $(A, B, C)$ is a valid Diffie-Hellman triple then so is $(A', B', C')$, and vice versa.
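The rerandomization $(A, B, C) \mapsto (A^u, B^v, C^{u \cdot v})$ can be checked exhaustively in a tiny group. The Python sketch below uses the subgroup of order 11 in $(\mathbb{Z}/23\mathbb{Z})^*$ generated by $g = 4$; these parameters, the brute-force discrete logarithm and the function names are assumptions chosen only to keep the example small.

```python
P, Q, G = 23, 11, 4            # G generates the subgroup of prime order Q modulo P


def dlog(x):
    """Brute-force discrete logarithm to base G (feasible only in a toy group)."""
    return next(k for k in range(Q) if pow(G, k, P) == x)


def is_dh_triple(triple):
    a, b, c = (dlog(x) for x in triple)
    return a * b % Q == c


def rerandomize(triple, u, v):
    A, B, C = triple
    return pow(A, u, P), pow(B, v, P), pow(C, u * v, P)


valid = (pow(G, 3, P), pow(G, 5, P), pow(G, 15, P))
invalid = (pow(G, 3, P), pow(G, 5, P), pow(G, 5, P))

print(all(is_dh_triple(rerandomize(valid, u, v)) for u in range(1, Q) for v in range(1, Q)))
print(any(is_dh_triple(rerandomize(invalid, u, v)) for u in range(1, Q) for v in range(1, Q)))
```

For non-zero $u$ and $v$ the quantity $c - a \cdot b$ is multiplied by $u \cdot v$, so it is zero after rerandomization exactly when it was zero before.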

One can show that, using a slightly more careful rerandomization than the simple one above (which leaves the ratio $c/(a \cdot b)$ unchanged), the distribution of $(A', B', C')$ will be uniform over all valid Diffie-Hellman triples if the original triple is a valid Diffie-Hellman triple, whilst the distribution will be uniform over all triples (and not just Diffie-Hellman ones) in the case where the original triple was not a valid Diffie-Hellman triple.
17.2. Knapsack-Based Cryptosystems

The idea of using known complexity-theoretic hard problems, as opposed to number-theoretic ones,
has  been  around  as  long  as  public  key  cryptography.  This  topic  has  become  more  important  in 
recent  years  as  all  of  our  existing  number-theoretic  constructions  for  public  key  cryptography  would 
become  broken  if  anyone  built  a  quantum  computer.  Thus  the  hunt  is  on  for  secure  and  efficient 
public  key  primitives  which  are  secure  against  any  future  quantum  computer. 

One  of  the  earliest  such  public  key  cryptosystems  was  based  on  the  knapsack  or  subset  sum 
problem, which is $\mathcal{NP}$-complete. However it turns out that this knapsack-based scheme, and almost
all  others,  can  be  shown  to  be  insecure,  as  we  shall  now  explain.  The  idea  is  to  create  two  problem 
instances,  a  public  one  which  is  hard,  which  is  believed  to  be  a  general  knapsack  problem,  and 
a  private  problem  which  is  easy.  In  addition  there  should  be  some  private  trapdoor  information 
which  transforms  the  hard  problem  into  the  easy  one. 

This  is  rather  like  the  use  of  the  RSA  problem.  It  is  hard  to  extract  eth  roots  modulo  a 
composite  number,  but  easy  to  extract  eth  roots  modulo  a  prime  number.  Knowing  the  trapdoor 
information,  namely  the  factorization  of  the  RSA  modulus,  allows  us  to  transform  the  hard  problem 
into  the  easy  problem.  However,  the  crucial  difference  is  that  whilst  producing  integers  which  are 
hard  to  factor  is  easy,  it  is  difficult  to  produce  knapsack  problems  which  are  hard  on  average.  This 
is  despite  the  general  knapsack  problem  being  considered  harder  than  the  general  factorization 
problem. 

Whilst  the  general  knapsack  problem  is  hard  there  is  a  particularly  easy  set  of  problems  based 
on  super-increasing  knapsacks.  A  super-increasing  knapsack  problem  is  one  where  the  weights  are 
such  that  each  one  is  greater  than  the  sum  of  the  preceding  ones,  i.e. 

$$w_j > \sum_{i=1}^{j-1} w_i.$$

As  an  example  one  could  take  the  set 


{2,3,6,13,27,  52} 


or  one  could  take 


{1,2,4,8,16,32,64,...}. 
Given a super-increasing knapsack problem, namely an ordered set of such super-increasing weights $\{w_1, \ldots, w_n\}$ and a target weight $S$, determining which weights to put in the sack is a linear operation, as can be seen from Algorithm 17.2.


Algorithm 17.2: Solving a super-increasing knapsack problem

for $i = n$ downto $1$ do
    if $S \ge w_i$ then
        $b_i \leftarrow 1$.
        $S \leftarrow S - w_i$.
    else
        $b_i \leftarrow 0$.
if $S = 0$ then return $(b_1, b_2, \ldots, b_n)$.
else return (“No Solution”).
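Algorithm 17.2 translates almost line for line into Python. In this sketch the function name is hypothetical and `None` plays the role of the “No Solution” answer.

```python
def solve_superincreasing(weights, target):
    """Greedy solver for a super-increasing knapsack; returns the bits b_i or None."""
    bits = [0] * len(weights)
    remaining = target
    for i in range(len(weights) - 1, -1, -1):   # from the largest weight downwards
        if remaining >= weights[i]:
            bits[i] = 1
            remaining -= weights[i]
    return bits if remaining == 0 else None


print(solve_superincreasing([2, 3, 6, 13, 27, 52], 70))
print(solve_superincreasing([2, 3, 6, 13, 27, 52], 1))
```

The greedy choice is forced: because each weight exceeds the sum of all smaller ones, a weight that fits into the remaining target must be used, since the smaller weights together could not make up the difference.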


Key Generation: The Merkle-Hellman encryption scheme takes as a private key a super-increasing knapsack problem and from this creates (using a private transform) a so-called “hard knapsack” problem. This hard problem is then the public key. This transform is achieved by choosing two private integers $A$ and $M$, with $M$ larger than the sum of all the weights so that decryption is unambiguous, such that

$$\gcd(A, M) = 1$$

and multiplying all values of the super-increasing sequence by $A \pmod{M}$. For example if we take
as  the  private  key 

•  the  super-increasing  knapsack  {2,  3,  6, 13,  27,  52}, 

•  A  =  31  and  M  =  105. 

Then  the  associated  public  key  is  given  by  the  “hard”  knapsack 

{62,93,81,88,102,37}. 

We  then  publish  the  hard  knapsack  problem  as  our  public  key,  with  the  idea  that  only  someone 
who  knows  A  and  M  can  transform  back  to  the  easy  super-increasing  knapsack. 

Encryption:  For  Bob  to  encrypt  a  message  to  us,  he  first  breaks  the  plaintext  into  blocks  the  size 
of  the  weight  set.  The  ciphertext  is  then  the  sum  of  the  weights  where  a  bit  is  set.  So  for  example 
if  the  message  is  given  by 

Message  =  011000  110101  101110 

Bob  obtains,  since  our  public  knapsack  is  {62,  93,  81,  88,  102,  37},  that  the  ciphertext  is 

174,  280,  333, 

since 

•  011000  corresponds  to  93  +  81  =  174, 

•  110101  corresponds  to  62  +  93  +  88  +  37  =  280, 

•  101110  corresponds  to  62  +  81  +  88  +  102  =  333. 

Decryption: The legitimate recipient knows the private key $A$, $M$ and $\{2, 3, 6, 13, 27, 52\}$. Hence, by multiplying each ciphertext block by $A^{-1} \pmod{M}$ the hard knapsack is transformed into the easy knapsack problem. In our case $A^{-1} = 61 \pmod{M}$, and so the decryptor performs the operations (modulo $M$)

• $174 \cdot 61 = 9 = 3 + 6$ = 011000,

• $280 \cdot 61 = 70 = 2 + 3 + 13 + 52$ = 110101,

• $333 \cdot 61 = 48 = 2 + 6 + 13 + 27$ = 101110.

The final decoding is done using the simple easy knapsack,

$$\{2, 3, 6, 13, 27, 52\},$$

and our earlier linear-time algorithm for super-increasing knapsack problems.
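The whole worked example (key generation, encryption and decryption) can be reproduced with a few lines of Python. The sketch below uses the toy parameters from the text; the function names are hypothetical, and messages are represented as strings of bits purely for readability.

```python
PRIVATE_WEIGHTS = [2, 3, 6, 13, 27, 52]   # super-increasing private knapsack
A, M = 31, 105                            # private multiplier and modulus, gcd(A, M) = 1


def public_key():
    """Disguise the easy knapsack by multiplying every weight by A modulo M."""
    return [A * w % M for w in PRIVATE_WEIGHTS]


def encrypt_block(bits, pub):
    """Sum the public weights at the positions where the message bit is 1."""
    return sum(w for w, b in zip(pub, bits) if b == "1")


def decrypt_block(ciphertext):
    """Multiply by A^{-1} mod M, then solve the super-increasing knapsack greedily."""
    remaining = ciphertext * pow(A, -1, M) % M
    bits = []
    for w in reversed(PRIVATE_WEIGHTS):
        if remaining >= w:
            bits.append("1")
            remaining -= w
        else:
            bits.append("0")
    return "".join(reversed(bits))


pub = public_key()
blocks = ["011000", "110101", "101110"]
ciphertexts = [encrypt_block(b, pub) for b in blocks]
print(pub)
print(ciphertexts)
print([decrypt_block(c) for c in ciphertexts])
```

Decryption works because $M = 105$ exceeds the sum $103$ of the private weights, so the reduced value is the true subset sum and not merely congruent to it.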


Our example knapsack problem of six items is too small; typically one would have at least 250 items. The values of $A$ and $M$ should also be around 400 bits. However, even with parameters as large as these, the above Merkle-Hellman encryption scheme can be broken using lattice-based techniques: Suppose we wish to solve the knapsack problem given by the weights $\{w_1, \ldots, w_n\}$ and the target $S$. Consider the lattice $L$ of dimension $n + 1$ generated by columns of the following matrix:


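$$A = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & \frac{1}{2} \\
0 & 1 & 0 & \cdots & 0 & \frac{1}{2} \\
0 & 0 & 1 & \cdots & 0 & \frac{1}{2} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & \frac{1}{2} \\
w_1 & w_2 & w_3 & \cdots & w_n & S
\end{pmatrix}.$$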

Now, since we are assuming there is a solution to our knapsack problem, given by the bit vector $(b_1, \ldots, b_n)$, we know that the vector $y = A \cdot x$ is in our lattice, where $x = (b_1, \ldots, b_n, -1)$. But the components of $y$ are given by

$$y_i = \begin{cases} b_i - \frac{1}{2} & 1 \le i \le n, \\ 0 & i = n + 1. \end{cases}$$

Hence, the vector $y$ is very short, since every non-zero component is $\pm\frac{1}{2}$ and so it has length bounded by

$$\|y\| = \sqrt{y_1^2 + \cdots + y_{n+1}^2} \le \frac{\sqrt{n}}{2}.$$

If $\{w_1, \ldots, w_n\}$ is a set of knapsack weights then we define the density of the knapsack to be

$$d = \frac{n}{\max\{\log_2(w_i) : 1 \le i \le n\}}.$$
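The short vector $y$ and the density $d$ can be computed directly for the toy public knapsack used earlier. The Python sketch below builds the matrix $A$ with exact rational entries; the function names are hypothetical, and the target 174 with solution bits $(0,1,1,0,0,0)$ is taken from the encryption example.

```python
from fractions import Fraction
import math


def lattice_matrix(weights, target):
    """The (n+1) x (n+1) matrix A from the text, stored as a list of rows."""
    n = len(weights)
    half = Fraction(1, 2)
    rows = [[Fraction(int(i == j)) for j in range(n)] + [half] for i in range(n)]
    rows.append([Fraction(w) for w in weights] + [Fraction(target)])
    return rows


def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def density(weights):
    """d = n / max log2(w_i)."""
    return len(weights) / max(math.log2(w) for w in weights)


weights = [62, 93, 81, 88, 102, 37]
A = lattice_matrix(weights, 174)
x = [0, 1, 1, 0, 0, 0, -1]             # (b_1, ..., b_n, -1)
y = mat_vec(A, x)

print(y)                               # entries are +-1/2, then 0
print(sum(c * c for c in y))           # squared length n/4
print(density(weights))
```

For this six-item example the squared length of $y$ is exactly $6/4$, matching the bound $\sqrt{n}/2$ on its length.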


One can show that a knapsack with low density will be easy to solve using lattice basis reduction. A low-density knapsack will usually result in a lattice with relatively large discriminant, hence the vector $y$ is exceptionally short in the lattice. If we now apply the LLL algorithm to the matrix $A$ we obtain a new basis matrix $A'$. The first basis vector $a'_1$ of this LLL-reduced basis is then likely to be the smallest vector in the lattice and so we are likely to have $a'_1 = y$ (up to sign). But given $y$ we can then solve for $x$ and recover the solution to the original knapsack problem. This allows us to break the Merkle-Hellman scheme, as the Merkle-Hellman construction will always produce a low-density public knapsack.
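To experiment with this attack one needs an implementation of LLL. The following is a deliberately simple textbook-style sketch in exact rational arithmetic, with the customary parameter $\delta = 3/4$; it recomputes the Gram-Schmidt data from scratch at every step, so it is suitable only for tiny examples. All function names are hypothetical, the basis is stored as a list of basis vectors (the columns of $A$), and no claim is made that the output matches any particular LLL implementation vector for vector.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(basis):
    """Return the orthogonalised vectors and the coefficients mu[i][j]."""
    ortho, mu = [], [[Fraction(0)] * len(basis) for _ in basis]
    for i, b in enumerate(basis):
        v = list(b)
        for j in range(i):
            mu[i][j] = dot(b, ortho[j]) / dot(ortho[j], ortho[j])
            v = [a - mu[i][j] * c for a, c in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu


def lll(basis, delta=Fraction(3, 4)):
    basis = [[Fraction(a) for a in b] for b in basis]
    k = 1
    while k < len(basis):
        for j in range(k - 1, -1, -1):             # size-reduce basis[k]
            _, mu = gram_schmidt(basis)
            r = round(mu[k][j])
            if r:
                basis[k] = [a - r * c for a, c in zip(basis[k], basis[j])]
        ortho, mu = gram_schmidt(basis)
        lhs = dot(ortho[k], ortho[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        if lhs >= rhs:                             # Lovasz condition holds
            k += 1
        else:
            basis[k], basis[k - 1] = basis[k - 1], basis[k]
            k = max(k - 1, 1)
    return basis


def knapsack_basis(weights, target):
    """Columns of the matrix A, each stored as one basis vector."""
    n = len(weights)
    cols = [[Fraction(int(i == j)) for i in range(n)] + [Fraction(w)]
            for j, w in enumerate(weights)]
    cols.append([Fraction(1, 2)] * n + [Fraction(target)])
    return cols


def recover_bits(reduced, weights, target):
    """Look for a reduced vector of the form +-(b_i - 1/2, ..., 0) and decode it."""
    half = Fraction(1, 2)
    for v in reduced:
        if v[-1] == 0 and all(abs(c) == half for c in v[:-1]):
            for sign in (1, -1):
                bits = [int(sign * c + half) for c in v[:-1]]
                if dot(bits, weights) == target:
                    return bits
    return None


weights = [62, 93, 81, 88, 102, 37]
reduced = lll(knapsack_basis(weights, 174))
print(recover_bits(reduced, weights, 174))
```

The decoding step checks both signs because a reduced basis may contain $-y$ instead of $y$; if no reduced vector has the expected shape the function returns `None`.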


Example: As an example we take our earlier knapsack problem of

$$b_1 \cdot 62 + b_2 \cdot 93 + b_3 \cdot 81 + b_4 \cdot 88 + b_5 \cdot 102 + b_6 \cdot 37 = 174.$$

We form the matrix

$$A = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} \\
0 & 1 & 0 & 0 & 0 & 0 & \frac{1}{2} \\
0 & 0 & 1 & 0 & 0 & 0 & \frac{1}{2} \\
0 & 0 & 0 & 1 & 0 & 0 & \frac{1}{2} \\
0 & 0 & 0 & 0 & 1 & 0 & \frac{1}{2} \\
0 & 0 & 0 & 0 & 0 & 1 & \frac{1}{2} \\
62 & 93 & 81 & 88 & 102 & 37 & 174
\end{pmatrix}.$$

We apply the LLL algorithm to this matrix so as to obtain the new lattice basis $A'$. We write $y$ for the first column of $A'$,

$$y = \frac{1}{2} \cdot (1, -1, -1, 1, 1, 1, 0)^{\mathsf{T}},$$

and compute

$$x = A^{-1} \cdot y = (0, -1, -1, 0, 0, 0, 1)^{\mathsf{T}}.$$

So we see that we can take $(b_1, b_2, b_3, b_4, b_5, b_6) = (0, 1, 1, 0, 0, 0)$, as a solution to our knapsack problem.


17.3.  Worst-Case  to  Average-Case  Reductions 

If  we  are  going  to  get  any  further  using  complexity-theoretic  hard  problems  we  are  going  to  need 
a  way  of  ensuring  that  we  are  choosing  instances  which  are  from  the  set  of  worst  cases.  Just  as  we 
did  for  the  RSA  and  DDH  problems  above,  the  way  we  do  this  is  to  provide  a  reduction  from  a 
worst-case  problem  to  an  average-case  problem.  However,  unlike  the  above  two  number-theoretic 
cases  our  reduction  will  not  be  to  the  same  problem,  but  from  an  instance  of  a  worst-case  problem 
of  type  X  to  an  average-case  instance  of  a  problem  of  type  Y . 
The goal is to show that problem Y is always hard, no matter how we set it up. So we want
to  show  that  Y  is  hard  on  average.  We  thus  take  a  problem  X  which  we  believe  to  be  hard  in  the 
worst  case,  and  then  we  show  how  to  solve  any  such  instance  (in  particular  a  worst-case  instance) 
using  an  oracle  which  solves  an  average  case  instance  of  problem  Y.  The  key  insight  is  that  if  we 
do  not  think  an  algorithm  for  the  worst  case  of  problem  X  can  exist,  then  this  leads  us  to  conclude 
there  cannot  be  an  algorithm  to  solve  the  average  case  of  problem  Y. 

We  will  only  vaguely  sketch  the  ideas  related  to  these  concepts  in  this  section,  as  the  technical 
details  are  rather  involved.  Let  us  concentrate  on  a  lattice-based  problem,  and  let  B  denote  the 
input  public  basis  matrix  of  some  hard  problem  (for  example  we  want  to  solve  the  shortest  vector 
problem  in  the  lattice  generated  by  the  columns  of  B).  Now  B  may  not  generate  a  “random  lattice” , 
in  fact  it  could  be  a  lattice  for  which  the  complexity  of  solving  the  SVP  problem  is  at  its  worst. 
From  B  we  want  to  construct  another  lattice  problem  which  “looks  average” . 

The  key  idea  is  that  we  want  to  generate  a  lattice  basis  A,  for  a  new  lattice,  such  that  the 
new  lattice  basis  is  “related”  to  the  basis  B  (so  we  can  map  solutions  to  a  problem  in  the  lattice 
generated  by  A  into  a  solution  to  a  problem  in  the  lattice  generated  by  B),  but  that  the  lattice 
generated  by  A  is  “uniformly  random”  in  some  sense.  To  generate  a  random  lattice  we  just  need  to 
pick  vectors  at  random,  and  take  the  lattice  generated  by  the  vectors.  So  we  need  some  process  to 
generate  random  vectors,  which  are  not  “too”  random,  but  are  random  enough  to  fool  an  algorithm 
which  solves  average-case  problems. 

Consider the lattice L(B) and pick a random vector x in the lattice; do not worry how this is
done since L(B) has infinitely many vectors. Now pick a non-lattice vector close to x, in particular
pick an error vector e chosen with a Gaussian distribution, with mean zero and some “small”
standard deviation. See Figure 17.1 for an example of the resulting probability distribution for a
two-dimensional lattice¹. Now make the standard deviation get larger and larger, and eventually
we will get a distribution which “looks” uniform but which is related to the original lattice basis
B; see Figure 17.2.
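To get a feel for why a wide Gaussian “looks” uniform, the following sketch works in one dimension with the lattice Z. This is an assumption made purely for illustration, since the text works with a general lattice L(B), and the function names are hypothetical. It computes the density of a mean-zero Gaussian after reduction modulo 1 and compares the density at a lattice point with the density half-way between two lattice points.

```python
import math

def density_mod_1(x, sigma, terms=50):
    """Density at x in [0, 1) of a mean-zero Gaussian reduced modulo the lattice Z."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    # Sum the Gaussian density over all translates x + k of x by lattice points k.
    return sum(norm * math.exp(-((x + k) ** 2) / (2.0 * sigma ** 2))
               for k in range(-terms, terms + 1))

def peak_to_trough(sigma):
    """Ratio of the density at a lattice point to the density half-way between points."""
    return density_mod_1(0.0, sigma) / density_mod_1(0.5, sigma)

if __name__ == "__main__":
    for sigma in (0.1, 0.3, 1.0):
        print(sigma, peak_to_trough(sigma))
```

For a small standard deviation the ratio is enormous (the mass sits on the lattice points), whereas once the standard deviation is comparable to the spacing of the lattice the ratio is indistinguishable from one.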




Figure  17.1.  Perturbing  a  lattice  by  a  Gaussian 


We  shall  use  this  technique  to  give  a  flavour  of  how  one  can  use  an  oracle  to  solve  the  small 
integer solution problem in the average case, so as to solve any approximate shortest vector problem,
i.e.  possibly  in  the  worst  case.  At  this  point  you  may  want  to  go  back  to  the  discussion  around 
Definitions  5.4  and  5.8  to  recap  some  basic  facts. 


¹ I thank Oded Regev and Noah Stephens-Davidowitz for the use of their pictures here.

Figure 17.2. Increasing the Gaussian’s standard deviation


Recall that for a matrix $A \in (\mathbb{Z}/q\mathbb{Z})^{n \times m}$ the SIS problem is to find a short vector $z \in \mathbb{Z}^m$ such
that $A \cdot z = 0 \pmod{q}$, where we shall assume that q is a prime. Let $\beta$ denote the bound on the
norm of z for the specific SIS instance. We will let $O_{SIS}(A, q, \beta)$ denote our oracle which solves SIS
in the average case.

We remarked, in Chapter 5, that solving SIS was equivalent to finding an equivalently short
vector in the lattice

\[ \Lambda_q^{\perp}(A) = \{ z \in \mathbb{Z}^m : A \cdot z = 0 \pmod{q} \}. \]

The lattice $\Lambda_q^{\perp}(A)$ has determinant $q^n$, and hence if we select $\beta = \sqrt{m} \cdot q^{n/m}$ then the SIS oracle
$O_{SIS}(A, q, \beta)$ is guaranteed to output a solution. We want to use this oracle to solve any approximate
shortest vector problem. In particular this means that if we can find short vectors in the lattice
$\Lambda_q^{\perp}(A)$ for a matrix A chosen uniformly at random from $(\mathbb{Z}/q\mathbb{Z})^{n \times m}$ then we can find short vectors
in any lattice. To do this we will go via another problem, related to the shortest vector problem,
namely the shortest independent vectors problem.
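To make the SIS problem concrete, here is a minimal sketch for toy parameters. The brute-force search is only a hypothetical stand-in for the oracle $O_{SIS}$ (no efficient algorithm is implied), and the parameters n = 2, m = 6, q = 7 are assumptions chosen so that $2^m > q^n$; by a pigeonhole argument a non-zero solution with entries in {-1, 0, 1} then always exists, and its norm is at most $\sqrt{m} \le \beta$.

```python
import itertools
import math
import random

def sis_brute_force(A, q, beta):
    """Toy stand-in for the SIS oracle: search z in {-1,0,1}^m with A.z = 0 (mod q)."""
    n, m = len(A), len(A[0])
    for z in itertools.product((-1, 0, 1), repeat=m):
        if not any(z):
            continue  # the zero vector is not a valid SIS solution
        if math.sqrt(sum(x * x for x in z)) > beta:
            continue  # too long for this instance
        if all(sum(A[r][j] * z[j] for j in range(m)) % q == 0 for r in range(n)):
            return list(z)
    return None

def random_instance(n, m, q, rng):
    """A uniformly random n x m matrix modulo q, as a list of rows."""
    return [[rng.randrange(q) for _ in range(m)] for _ in range(n)]

if __name__ == "__main__":
    n, m, q = 2, 6, 7
    beta = math.sqrt(m) * q ** (n / m)
    A = random_instance(n, m, q, random.Random(1))
    print(sis_brute_force(A, q, beta))
```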


Definition 17.5 (Shortest Independent Vectors Problem (SIVP)). Given a lattice L and an integer
$\gamma$ the SIVP problem SIVP$_\gamma$ is to find a basis B of a full-rank sub-lattice such that the length of all
vectors in B is less than $\gamma \cdot \lambda_n(L)$, where $\lambda_n(L)$ is the nth successive minimum of the lattice.

Recall  the  transference  theorem  of  Banaszczyk  from  Chapter  5:  this  stated  that  for  all  n-dimensional 
lattices we have

\[ 1 \le \lambda_1(L) \cdot \lambda_n(L^*) \le n. \]

Hence, if we can approximate the nth successive minimum of a lattice, which is what a solution to
the SIVP$_\gamma$ problem does, then we can approximate the shortest vector problem of its dual, and vice
versa.

So we now restrict ourselves to considering the SIVP$_\gamma$ problem, which we will solve for
$\gamma = 16 \cdot n \cdot \sqrt{m} \cdot C_2$, for a constant $C_2$ defined below. We let $B \in \mathbb{Z}^{n \times n}$ denote the input matrix to our
SIVP$_\gamma$ problem, consisting of the n column vectors $\{b_1, \ldots, b_n\}$. We let $\|B\| = \max_i \|b_i\|$ and our
goal is to find a lattice basis with $\|B\| \le \gamma \cdot \lambda_n(L)$. We will do this by successively replacing B by
a basis $B'$ with $\|B'\| \le \|B\|/2$. This is done by finding a vector y in the lattice of size bounded by
$\|B\|/2$. Then, provided the output vector y is linearly independent of the other vectors in B, we can
replace the largest vector in B with the new vector. Thus we obtain a basis with a smaller value of $\|B\|$
and we can repeat. It turns out that the probability that the output vector fails to be independent of
the other vectors in B is very small.

Our  method  proceeds  as  follows: 

• Pick $m \leftarrow c_1 \cdot n \cdot \log q$, for some constant $0 < c_1 < 1$. We will not worry about the precise
details/constants in this overview. We are just giving the basic ideas. With this choice of
m we have that $q^{n/m} = \exp(1/c_1) = C_2$.

• $\beta \leftarrow \sqrt{m} \cdot C_2$.

• For $1 \le i \le m$ perform the following steps:

   - Pick $e_i$ to be an n-dimensional vector with coefficients picked from an
     n-dimensional Gaussian distribution with standard deviation $\sigma = \sqrt{n} \cdot \|B\|/(2 \cdot q)$.

   - Set $v_i \leftarrow (e_i \pmod{B}) - e_i$, where (mod B) means reduce into the fundamental
     parallelepiped defined by the basis B. Note that $v_i \in L(B)$.

   - Now set $u_i \leftarrow \frac{1}{q} \cdot B \cdot a_i - (e_i \pmod{B})$, for the integer vector $a_i$ obtained by rounding
     $(e_i \pmod{B})$ to the nearest lattice point of $L(\frac{1}{q} \cdot B)$. The sign is chosen so that
     $e_i + u_i + v_i = \frac{1}{q} \cdot B \cdot a_i$, which is the identity used in the analysis that follows.

• $A \leftarrow [a_1, \ldots, a_m]$.

• $z \leftarrow O_{SIS}(A, q, \beta)$.

• $y \leftarrow \sum_{i=1}^{m} z_i \cdot (e_i + u_i)$.

• Output y.

We need to show that y is indeed a lattice vector, that it has norm less than $\|B\|/2$, and that the
SIS oracle $O_{SIS}(A, q, \beta)$ will actually work as expected. Let us look at these three points in turn:

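• $y \in L(B)$: Recall that the vector z output by the $O_{SIS}(A, q, \beta)$ oracle will satisfy $A \cdot z = 0 \pmod{q}$.
Thus there is a $w \in \mathbb{Z}^n$ with $A \cdot z = q \cdot w$. We then have

\[ B \cdot w = \frac{1}{q} \cdot B \cdot (q \cdot w) = \frac{1}{q} \cdot B \cdot A \cdot z = \frac{1}{q} \cdot B \cdot \sum_{i=1}^{m} a_i \cdot z_i = \sum_{i=1}^{m} (e_i + u_i + v_i) \cdot z_i = y + \sum_{i=1}^{m} v_i \cdot z_i. \]

Thus, since $B \cdot w$ is a lattice vector, and so is each $v_i$, we have that y must also be a lattice
vector.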

• $\|y\| \le \|B\|/2$: We combine two facts to achieve this bound:

(1) First note that the tuple $(a_i, e_i, u_i, v_i)$ satisfies $e_i + u_i + v_i = \frac{1}{q} \cdot B \cdot a_i$, and that
since the maximum length of a basis vector of $L(\frac{1}{q} \cdot B)$ is bounded by $\|B\|/q$, we have
$\|u_i\| \le n \cdot \|B\|/q$.

(2) Secondly, since $e_i$ has been chosen from an n-dimensional Gaussian with standard
deviation $\sigma$ we have that, with very high probability², $\|e_i\| \le 2 \cdot \sigma \cdot \sqrt{n}$.

Combining these two facts together, and using the fact that $\sigma = \sqrt{n} \cdot \|B\|/(2 \cdot q)$, we
find that $\|e_i + u_i\| \le 2 \cdot n \cdot \|B\|/q$. Then we use that the output of $O_{SIS}(A, q, \beta)$ satisfies
$\|z\| \le \beta$, to show

\[ \|y\| = \Big\| \sum_{i=1}^{m} (e_i + u_i) \cdot z_i \Big\| \le \sum_{i=1}^{m} \|e_i + u_i\| \cdot |z_i| \le \sqrt{n} \cdot \|z\| \cdot \frac{2 \cdot n \cdot \|B\|}{q} \le \frac{2 \cdot n^{1.5} \cdot \beta \cdot \|B\|}{q} \le \frac{\|B\|}{2}. \]

The last inequality follows by our choice of $\beta$.

• The SIS oracle $O_{SIS}(A, q, \beta)$ is “well behaved”: By choice of $\beta$ there is a solution to the SIS
problem for the input matrix A, so we only require that the oracle finds a solution. The
SIS oracle will be well behaved, by assumption, if the input is a random SIS instance, i.e.
the matrix A looks like it was selected at random from the set of all matrices modulo q.
This is where the complicated analysis comes in. The basic idea is that since the standard
deviation $\sigma$ is chosen to be relatively large compared to the lattice basis B, the statistical
distance between the vectors $e_i$ modulo L(B) and the uniform distribution of $\mathbb{R}^n$ modulo
L(B) is very small. Then since $a_i$ is simply the rounding of these values to a grid point,
they too act like uniform random vectors. We also need to show that the new vector is
linearly independent of the other vectors, again something which we skip over here.

So how large must $\sigma$ be? The analysis says that all will be fine with the SIS oracle if
we take $\sigma = \sqrt{n} \cdot \|B\|/(2 \cdot q) \ge 2 \cdot \lambda_n(L)$. But this will imply that our input matrix B
must satisfy $\|B\| \ge 4 \cdot q \cdot \lambda_n(L)/\sqrt{n} = \gamma \cdot \lambda_n(L)$. Thus our reduction will work as long
as the basis satisfies this condition. Hence the reduction will stop as soon as we have
$\|B\| < \gamma \cdot \lambda_n(L)$, i.e. as soon as we have solved the input SIVP$_\gamma$ problem.

² In fact exponentially high probability, the estimate being obtained from examining the tail of the Gaussian
distribution.


17.4.  Learning  With  Errors  (LWE) 


Another  lattice-based  problem  with  a  similar  worst-case  to  average-case  connection  is  the  Learning 
With  Errors  (LWE)  problem.  An  oracle  to  solve  the  average-case  complexity  of  this  problem  can 
be  used  to  solve  the  Bounded-Distance  Decoding  (BDD)  problem  in  a  related  lattice,  although  the 
reduction  is  far  more  involved  than  even  our  sketch  of  the  SIS  to  SVP  reduction  above. 

To define the LWE problem we first need to define an error distribution, which we shall denote
by $D_{\mathbb{Z}^n,\sigma}$; this distribution produces n-dimensional integer vectors whose coefficients are distributed
approximately like a normal distribution with mean zero and standard deviation $\sigma$. One should
think of the elements output with high probability by the distribution as having “small” coefficients.
We have two LWE problems: the search and decision problems; we can define advantages
for adversaries against these problems in the usual way.

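Definition 17.6 (LWE Search Problem). Pick $s \leftarrow (\mathbb{Z}/q\mathbb{Z})^m$, $e \leftarrow D_{\mathbb{Z}^n,\sigma}$ and $A \leftarrow (\mathbb{Z}/q\mathbb{Z})^{n \times m}$,
where n > m, and set $b \leftarrow A \cdot s + e \pmod{q}$. The search problem is given the pair (A, b) to output
the vector s.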


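Definition 17.7 (LWE Decision Problem). Given (A, b) where $A \in (\mathbb{Z}/q\mathbb{Z})^{n \times m}$, where n > m and
$b \in (\mathbb{Z}/q\mathbb{Z})^n$; determine which of the following two cases holds:

(1) b is chosen uniformly at random from $(\mathbb{Z}/q\mathbb{Z})^n$.

(2) $b \leftarrow A \cdot s + e \pmod{q}$ where $e \leftarrow D_{\mathbb{Z}^n,\sigma}$ and $s \leftarrow (\mathbb{Z}/q\mathbb{Z})^m$.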


Let us first consider the search problem. If the error vector e were zero, we would then be
solving the linear system of equations $A \cdot s = b \pmod{q}$, for a matrix with more rows than columns.
This will have a solution if the set of equations is consistent, which it will be by our construction.
Finding that solution just requires linear algebra modulo q (for example Gaussian elimination), and is
hence an easy computational problem. So simply adding a small error vector e seems to make the problem
very much harder. In some sense we are decoding a random linear code over $\mathbb{Z}/q\mathbb{Z}$, where the error vector is e.
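The contrast between the noise-free and the noisy system can be seen in a small sketch. The helper names below are hypothetical, the rounded Gaussian is only an assumed stand-in for the error distribution, and the solver is plain Gaussian elimination modulo a prime q: it reads the answer off the pivot rows and makes no attempt to cope with errors.

```python
import random

def solve_mod_q(A, b, q):
    """Solve A.s = b (mod q), q prime, by Gaussian elimination; None if rank-deficient.

    The answer is read off the pivot rows, so consistency of the remaining rows
    is not checked: with a non-zero error vector the result is simply wrong.
    """
    n, m = len(A), len(A[0])
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    row = 0
    for col in range(m):
        pivot = next((r for r in range(row, n) if M[r][col] % q), None)
        if pivot is None:
            return None
        M[row], M[pivot] = M[pivot], M[row]
        inv = pow(M[row][col], -1, q)  # modular inverse (Python 3.8+)
        M[row] = [(x * inv) % q for x in M[row]]
        for r in range(n):
            if r != row and M[r][col] % q:
                f = M[r][col]
                M[r] = [(x - f * y) % q for x, y in zip(M[r], M[row])]
        row += 1
    return [M[i][m] for i in range(m)]

def lwe_sample(A, s, q, sigma, rng):
    """Return b = A.s + e (mod q) together with the rounded-Gaussian error e."""
    e = [round(rng.gauss(0, sigma)) for _ in A]
    b = [(sum(x * y for x, y in zip(row, s)) + ei) % q for row, ei in zip(A, e)]
    return b, e

if __name__ == "__main__":
    rng = random.Random(0)
    q, n, m = 97, 6, 3
    A = [[rng.randrange(q) for _ in range(m)] for _ in range(n)]
    s = [rng.randrange(q) for _ in range(m)]
    b, e = lwe_sample(A, s, q, 1.0, rng)
    print("secret:", s, "error:", e, "naive solve:", solve_mod_q(A, b, q))
```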

We can relate the search problem to our q-ary lattices from Chapter 5, where we let $L = \Lambda_q(A^{\mathsf{T}})$.
We first note that recovering e is equivalent to recovering s. The LWE problem then is to find a
vector $x \in L$ such that $b - x = e$. So to turn this into a Bounded-Distance Decoding problem we
need to bound the size of the error vector e. However, e was chosen from a distribution which was
a discrete approximation to the Gaussian distribution, so we can prove that its norm is not too
large. In fact one can show that with very high probability $\|e\| \le 2 \cdot \sigma \cdot \sqrt{n}/\sqrt{2 \cdot \pi}$. Thus we wish
to solve a BDD$_\alpha$ problem for $\alpha = 2 \cdot \sigma \cdot \sqrt{n}/(\sqrt{2 \cdot \pi} \cdot \lambda_1(L))$.

Before  proceeding  we  state  some  basic  facts,  without  proof,  about  the  LWE  problem. 


•  The  decision  and  search  problems  are  equivalent,  i.e.  given  an  oracle  to  solve  the  decision 
problem  we  can  use  it  to  solve  the  search  problem. 

• The secret vector s in the definition can be selected to come from the error distribution
$D_{\mathbb{Z}^m,\sigma}$ with no reduction in hardness.

• Whilst the worst-case/average-case hardness result requires the error distribution to come
from $D_{\mathbb{Z}^n,\sigma}$, in practice we can select any distribution which outputs sufficiently “small”
and high-entropy vectors.


17.4.1.  Public  Key  System  Based  on  LWE:  We  can  now  define  a  public  key  cryptosystem 
whose  hardness  is  closely  related  to  the  worst  case  complexity  of  the  Bounded-Distance  Decoding 
problem,  and  for  which  we  currently  do  not  know  of  any  attacks  which  can  be  mounted  with  a 
quantum computer. The scheme is parametrized by integer values n, m and q, with $m \approx n \cdot \log q$
and q > 2 prime. In what follows for the rest of this chapter when we reduce something modulo q
we take a representative in the range $(-q/2, \ldots, q/2]$, i.e. we take the set of representatives centred
around zero.


Key Generation: For the private key we select $s \leftarrow (\mathbb{Z}/q\mathbb{Z})^n$. To create the public key we generate
m vectors $a_i \leftarrow (\mathbb{Z}/q\mathbb{Z})^n$, and m error values $e_i \leftarrow D_{\mathbb{Z},\sigma}$. We set $b_i \leftarrow a_i \cdot s - 2 \cdot e_i \pmod{q}$ and
output the public key $pk \leftarrow ((a_1, b_1), \ldots, (a_m, b_m))$.


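Encryption: To encrypt a message (which will be a single bit $m \in \{0, 1\}$), the encryptor picks a
subset of indicator bits in $\{1, \ldots, m\}$, i.e. he picks $t_i \leftarrow \{0, 1\}$ for $i = 1, \ldots, m$ (in the index range
and in the summation limits m is the number of pairs in the public key, not the message bit). The
ciphertext is then the pair of values (consisting of a vector c and an integer d)

\[ c \leftarrow \sum_{i=1}^{m} t_i \cdot a_i \quad \text{and} \quad d \leftarrow m - \sum_{i=1}^{m} t_i \cdot b_i. \]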

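Decryption: Decryption of (c, d) is performed by evaluating

\[ \begin{aligned}
\big( c \cdot s + d \pmod{q} \big) \pmod{2}
  &= \Big( \sum_{i=1}^{m} \big( t_i \cdot a_i \cdot s - t_i \cdot b_i \big) + m \pmod{q} \Big) \pmod{2} \\
  &= \Big( \sum_{i=1}^{m} 2 \cdot t_i \cdot e_i + m \pmod{q} \Big) \pmod{2} && \text{by definition of } b_i \\
  &= (2 \cdot \text{“small”} + m) \pmod{2} && \text{since the } e_i \text{ are small} \\
  &= m.
\end{aligned} \]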

Note that the sum $\eta = \sum_{i=1}^{m} 2 \cdot t_i \cdot e_i$ will be “small” modulo q, and will not “wrap around” (i.e.
exceed q/2 in absolute value) since the $e_i$ have been selected suitably small. We shall call the value
$\eta$ the “noise” associated to the ciphertext. The reason decryption works is that this noise is small
relative to the modulus q, and hence when we are taking the inner modulo q operation we obtain
the value of $\eta + m$ over the integers. The reason for taking reduction centred around zero is to
ensure this last fact. Since $\eta$ is divisible by two we can recover the message m by reducing the
result modulo two.

The  general  design  of  the  scheme  is  that  the  public  key  contains  a  large  number  of  encryptions 
of  zero.  To  encrypt  a  message  m  we  take  a  subset  of  these  encryptions  of  zero  and  add  them  to  the 
message,  thus  obtaining  an  encryption  of  m,  assuming  the  “noise”  does  not  get  too  large.  It  can 
be  shown  that  the  above  scheme  is  IND-CPA  assuming  the  LWE  decision  problem  is  hard. 
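The scheme above can be sketched in a few lines for toy parameters. Everything concrete here is an assumption for illustration: the parameters (q = 257, n = 4, m = 20) are far too small to be secure, the names are hypothetical, and the error values are rounded Gaussians clipped to [-2, 2] so that the noise $\eta$ is provably below q/2 and decryption never fails.

```python
import random

Q, N, M = 257, 4, 20        # toy modulus, dimension and number of public-key pairs
SIGMA, E_MAX = 1.0, 2       # error spread and clipping bound (assumed, for illustration)

def centred(x, q=Q):
    """Representative of x mod q in the range (-q/2, q/2]."""
    x %= q
    return x - q if x > q // 2 else x

def small_error(rng):
    """Rounded Gaussian clipped to [-E_MAX, E_MAX]."""
    return max(-E_MAX, min(E_MAX, round(rng.gauss(0, SIGMA))))

def keygen(rng):
    s = [rng.randrange(Q) for _ in range(N)]
    pk = []
    for _ in range(M):
        a = [rng.randrange(Q) for _ in range(N)]
        b = (sum(x * y for x, y in zip(a, s)) - 2 * small_error(rng)) % Q
        pk.append((a, b))          # each pair behaves like an encryption of zero
    return pk, s

def encrypt(pk, bit, rng):
    t = [rng.randrange(2) for _ in pk]   # indicator bits selecting a random subset
    c = [sum(ti * a[j] for ti, (a, _) in zip(t, pk)) % Q for j in range(N)]
    d = (bit - sum(ti * b for ti, (_, b) in zip(t, pk))) % Q
    return c, d

def decrypt(sk, ct):
    c, d = ct
    # c.s + d equals bit + 2*sum(t_i*e_i) modulo Q; centring recovers it over the integers.
    return centred(sum(x * y for x, y in zip(c, sk)) + d) % 2

if __name__ == "__main__":
    rng = random.Random(2024)
    pk, sk = keygen(rng)
    print([decrypt(sk, encrypt(pk, bit, rng)) for bit in (0, 1, 1, 0)])
```

With these choices the noise satisfies $|\eta| \le 2 \cdot 20 \cdot 2 = 80$, comfortably below q/2, which is why the inner reduction modulo q never wraps around.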


17.4.2. Ring-LWE: The problem with the above LWE-based system is that we can only encrypt
messages in {0, 1}, and that each ciphertext consists of a vector $c \in (\mathbb{Z}/q\mathbb{Z})^n$ and a value $d \in \mathbb{Z}/q\mathbb{Z}$.
The public key is also very large, consisting of m vectors of length n and m integers (all with
values in $\mathbb{Z}/q\mathbb{Z}$). Thus the overhead is relatively large. To reduce this overhead, and to increase the
available message space, it is common to use a variant of LWE based on rings of polynomials, called
Ring-LWE. Ring-LWE based constructions are also more efficient from a computational perspective.

Recall that normal LWE is about matrix-vector multiplications, $x \mapsto A \cdot x$. In Ring-LWE we
take “special” matrices A, which arise from polynomial rings. Consider the ring of polynomials
$R = \mathbb{Z}[X]/F(X)$, where F(X) is an integer polynomial of degree n. The ring R is the set of
polynomials of degree less than n with integer coefficients, addition being standard polynomial
addition and multiplication being polynomial multiplication followed by reduction modulo F(X).
As a shorthand we will use $R_q$ to denote the ring R reduced modulo q, i.e. the ring $(\mathbb{Z}/q\mathbb{Z})[X]/F(X)$.

Now there is a link between polynomial rings and matrix rings, since we can think of a polynomial
as defining a vector and a matrix. In particular suppose we have two polynomials a(X)
and b(X) in the ring R, whose product in R is c(X). Clearly we can think of b(X) as a vector
b, where we take the coefficients of b(X) as the vector of coefficients. Similarly for c(X) and a
vector c. Now the polynomial a(X) can be represented by an n x n matrix, which we shall call
$M_a$, such that $c = M_a \cdot b$. The mapping from R to n x n matrices which sends a(X) to $M_a$ is what
mathematicians call the matrix representation of the ring R.
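As a small illustration of the matrix representation, the sketch below fixes $F(X) = X^n + 1$, which is an assumption (the text allows any integer polynomial of degree n), and uses hypothetical function names. Column j of $M_a$ holds the coefficients of $a(X) \cdot X^j$ reduced modulo F(X), so that multiplying by a(X) in R agrees with the matrix-vector product.

```python
def polymul_mod(a, b):
    """Multiply in Z[X]/(X^n + 1); coefficient lists with the lowest degree first."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                c[k] += ai * bj
            else:
                c[k - n] -= ai * bj   # reduce using X^n = -1
    return c

def matrix_rep(a):
    """The n x n matrix M_a whose column j is the coefficient vector of a(X)*X^j."""
    n = len(a)
    cols = []
    for j in range(n):
        x_j = [0] * n
        x_j[j] = 1
        cols.append(polymul_mod(a, x_j))
    return [[cols[j][r] for j in range(n)] for r in range(n)]

def matvec(M, v):
    return [sum(x * y for x, y in zip(row, v)) for row in M]

if __name__ == "__main__":
    a, b = [1, 2, 3, 4], [5, 6, 7, 8]
    print(polymul_mod(a, b))
    print(matvec(matrix_rep(a), b))   # same vector, computed as M_a . b
```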

With this interpretation of polynomials as vectors and matrices we can try to translate the LWE
problem above into the polynomial ring setting. We first define the error distribution $D_{R,\sigma}$, which
now outputs polynomials of degree less than n whose coefficients are distributed approximately like
a normal distribution with mean zero and standard deviation $\sigma$. Again the outputs are polynomials
that with high probability have “small” coefficients. We let $B_0$ denote a bound on the absolute
value of the coefficients sampled by the distribution $D_{R,\sigma}$; actually this is an expected bound since
the distribution could sample a polynomial with huge coefficients, but this is highly unlikely. Given
this distribution we can define two Ring-LWE problems, again one search and one decision.

Definition 17.8 (Ring-LWE Search Problem). Pick $a, s \leftarrow R_q$ and $e \leftarrow D_{R,\sigma}$ and set $b \leftarrow a \cdot s + e \pmod{q}$.
The search problem is given the pair (a, b) to output the value s.

Definition 17.9 (Ring-LWE Decision Problem). Given (a, b) where $a, b \in R_q$ determine which of
the following two cases holds:

(1) b is chosen uniformly at random from $R_q$.

(2) $b \leftarrow a \cdot s + e \pmod{q}$ where $e \leftarrow D_{R,\sigma}$ and $s \leftarrow R_q$.

We note that the value s can be selected from the distribution $D_{R,\sigma}$, and in the public key scheme
we present next we will indeed do this. The reason for this modification will become apparent later.

17.4.3.  Public  Key  System  Based  on  Ring-LWE:  We  can  immediately  map  the  public  key 
LWE-based  system  over  to  the  Ring-LWE  setting.  However,  we  would  like  to  do  this  in  a  way 
which  also  reduces  the  size  of  the  public  key.  Recall  that  in  the  LWE  encryption  scheme,  the  public 
key  contained  a  large  number  of  encryptions  of  zero.  What  we  require  is  a  mechanism  to  come  up 
with  these  encryptions  of  zero  “on  the  fly”  without  needing  to  place  them  in  the  public  key. 


Key Generation: We pick coprime integers p and q with $p \ll q$, and a ring R as above, plus a
standard deviation $\sigma$. The “security” will depend on the dimension n of the ring R and the size
of $\sigma$ and q. We do not go into these details in this book as the discussion gets rather complicated,
rather quickly. The set $\{p, q, R, \sigma\}$ can be considered the domain parameters, similar to the domain
parameters in ElGamal encryption. To generate an individual’s public key and the private key we
execute:

• $s, e \leftarrow D_{R,\sigma}$.

• $a \leftarrow R_q$.

• $b \leftarrow a \cdot s + p \cdot e \pmod{q}$.

• $pk \leftarrow (a, b)$, $sk \leftarrow s$.

With this scheme we will be able to encrypt arbitrary elements of the ring $R_p$, and we can think
of the public key as a random encryption of the element zero in this ring.


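Encryption: To encrypt a message $m \in R_p$ we execute

• $e_0, e_1, e_2 \leftarrow D_{R,\sigma}$.

• $c_0 \leftarrow b \cdot e_0 + p \cdot e_1 + m$.

• $c_1 \leftarrow a \cdot e_0 + p \cdot e_2$.

The ciphertext is then $(c_0, c_1)$.

Decryption: To decrypt we compute

\[ \begin{aligned}
\big( c_0 - c_1 \cdot s \pmod{q} \big) \pmod{p}
  &= \big( (b \cdot e_0 + p \cdot e_1 + m) - (a \cdot e_0 + p \cdot e_2) \cdot s \pmod{q} \big) \pmod{p} \\
  &= \big( a \cdot s \cdot e_0 + p \cdot e \cdot e_0 + p \cdot e_1 + m - a \cdot e_0 \cdot s - p \cdot e_2 \cdot s \pmod{q} \big) \pmod{p} \\
  &= \big( p \cdot (e \cdot e_0 + e_1 - e_2 \cdot s) + m \pmod{q} \big) \pmod{p} \\
  &= \big( p \cdot \text{“small”} + m \pmod{q} \big) \pmod{p} \\
  &= \big( p \cdot \text{“small”} + m \big) \pmod{p} && \text{since } s, e, e_i \text{ are small} \\
  &= m.
\end{aligned} \]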


Another way to think of decryption is to take the vector product between the vector $(c_0, c_1)$ and
the vector $(1, -s)$, a viewpoint which we shall use below.

Notice how the “noise” associated with the ciphertext is given by

\[ \eta = e \cdot e_0 + e_1 - e_2 \cdot s \]

and that this is small since we picked not only $e$, $e_0$, $e_1$ and $e_2$ from the error distribution, but also
s. Also notice how we use the randomization by $e_0$, $e_1$ and $e_2$ to produce a random “encryption”
of zero in the encryption algorithm. Thus we have not only managed to improve the message
capacity from {0, 1} to $R_p$, but we have also reduced the public key size as well. It can be shown
that the above scheme is IND-CPA assuming the Ring-LWE decision problem is hard. We cannot
make the scheme as it stands IND-CCA secure as it suffers from an issue of malleability. Indeed
the malleability of the scheme is very interesting as it allows us to construct a so-called Fully
Homomorphic Encryption (FHE) scheme.
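A toy version of the Ring-LWE scheme can be sketched as follows. The concrete choices are assumptions for illustration only: the ring is $\mathbb{Z}[X]/(X^8 + 1)$, the parameters are p = 7 and q = 1009, and “small” polynomials have coefficients drawn uniformly from {-1, 0, 1} as a stand-in for $D_{R,\sigma}$. With these choices each coefficient of the noise is at most 8 + 1 + 8 = 17 in absolute value, so $p \cdot \eta + m$ stays well below q/2 and decryption always succeeds. All names are hypothetical.

```python
import random

N, P, Q = 8, 7, 1009    # toy ring degree, plaintext modulus, ciphertext modulus

def centred(x, q):
    """Representative of x mod q in the range (-q/2, q/2]."""
    x %= q
    return x - q if x > q // 2 else x

def add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]

def scale(k, a):
    return [(k * x) % Q for x in a]

def mul(a, b):
    """Multiply in (Z/QZ)[X]/(X^N + 1); coefficient lists with the lowest degree first."""
    c = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < N:
                c[k] += ai * bj
            else:
                c[k - N] -= ai * bj   # X^N = -1
    return [x % Q for x in c]

def small(rng):
    return [rng.choice((-1, 0, 1)) for _ in range(N)]

def keygen(rng):
    s, e = small(rng), small(rng)
    a = [rng.randrange(Q) for _ in range(N)]
    b = add(mul(a, s), scale(P, e))          # b = a*s + p*e (mod q)
    return (a, b), s

def encrypt(pk, m, rng):
    a, b = pk
    e0, e1, e2 = small(rng), small(rng), small(rng)
    c0 = add(add(mul(b, e0), scale(P, e1)), m)
    c1 = add(mul(a, e0), scale(P, e2))
    return c0, c1

def decrypt(sk, ct):
    c0, c1 = ct
    v = add(c0, scale(-1, mul(c1, sk)))      # c0 - c1*s (mod q)
    return [centred(x, Q) % P for x in v]

if __name__ == "__main__":
    rng = random.Random(7)
    pk, sk = keygen(rng)
    m = [rng.randrange(P) for _ in range(N)]
    print(m, decrypt(sk, encrypt(pk, m, rng)))
```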


17.4.4. Fully Homomorphic Encryption: Usually a scheme being malleable is a bad thing;
however, in some situations it is actually useful. For example, in Chapter 21 we shall see how an
additively homomorphic encryption scheme can be used to implement a secure electronic voting
protocol. We have seen a number of schemes so far which are multiplicatively homomorphic, e.g.
naive RSA encryption and ElGamal encryption, and one which is additively homomorphic, namely
Paillier encryption. A scheme which is both additively and multiplicatively homomorphic is called
fully homomorphic.

More formally let R be a ring with operations (+, ·), and let E and D denote the public key
encryption and decryption operations, which encrypt and decrypt elements of R. So we have

\[ D\big( E(m, pk), sk \big) = m. \]

A scheme is said to be fully homomorphic if there are two functions Add and Mult which each take
as input two ciphertexts, and respectively return ciphertexts as output such that for all $m_1, m_2 \in R$
we have

\[ D\big( \mathrm{Add}( E(m_1, pk), E(m_2, pk) ), sk \big) = m_1 + m_2, \]

\[ D\big( \mathrm{Mult}( E(m_1, pk), E(m_2, pk) ), sk \big) = m_1 \cdot m_2. \]

To  see  why  a  fully  homomorphic  encryption  scheme  might  be  useful,  consider  the  following  situation. 
Suppose  Alice  encrypts  some  information  m  and  stores  the  ciphertext  c  on  a  remote  server.  Then 
later on Alice wants to compute some function F on the message, for example she may want to
know whether the message was an email from Bob saying “I love you”. It would appear that Alice
needs  to  retrieve  the  ciphertext  from  the  server,  decrypt  it,  and  then  perform  the  calculation.  This 
might  not  be  convenient  (especially  in  the  case  when  the  message  is  a  huge  database,  say). 

With  a  fully  homomorphic  encryption  scheme  Alice  can  instead  send  the  function  F  to  the 
server, and the server can then compute a new ciphertext $c_F$ which is the encryption of F(m).
Amazingly  this  can  be  done  without  the  server  ever  needing  to  obtain  the  message  m.  The  reason 
for  this  is  that  every  function  can  be  expressed  as  a  series  of  additions  and  multiplications  over  a 
ring,  and  hence  we  can  express  F  as  such  a  series.  Then  we  apply  the  addition  and  multiplication 
of  the  fully  homomorphic  encryption  scheme  to  obtain  a  ciphertext  which  decrypts  to  F(m). 

The  idea  of  constructing  a  fully  homomorphic  encryption  scheme  dates  back  to  the  early  days 
of  public  key  cryptography.  But  it  was  not  until  2009  that  even  a  theoretical  construction  was 
proposed,  by  Craig  Gentry.  Gentry’s  original  construction  was  highly  complicated,  but  now  we  can 
present  relatively  simple  (although  not  very  efficient)  schemes  based  on  the  LWE  and  Ring-LWE 
problems. 

Let us look again at our Ring-LWE encryption scheme. All we need look at is what a ciphertext
is; we can ignore the encryption procedure. A ciphertext is a pair $c = (c_0, c_1)$ such that

\[ c_0 - c_1 \cdot s = m + p \cdot \eta_c \pmod{q}, \]

where $\eta_c$ is the noise associated with the ciphertext c. It is clear that this encryption scheme
is additively homomorphic to some extent, since we have for two ciphertexts $c = (c_0, c_1)$ and
$c' = (c_0', c_1')$ encrypting m and m' that

\[ \begin{aligned}
(c_0 + c_0') - (c_1 + c_1') \cdot s &= (c_0 - c_1 \cdot s) + (c_0' - c_1' \cdot s) \\
  &= (m + p \cdot \eta_c) + (m' + p \cdot \eta_{c'}) \pmod{q} \\
  &= (m + m') + p \cdot (\eta_c + \eta_{c'}) \pmod{q}.
\end{aligned} \]

In particular we can keep adding ciphertexts together componentwise and they will decrypt to the
sum of the associated messages, up until the point when the sum of the noises becomes too large,
when decryption will fail. We call a system for which we can perform a limited number of such
additions a somewhat additively homomorphic scheme.
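The arithmetic of adding ciphertexts, and of the eventual decryption failure, can be followed in a degenerate scalar setting in which ring elements are replaced by plain integers modulo q. This is an assumption made purely to keep the numbers readable; it has no security whatsoever. Ciphertexts are built directly from the decryption equation with hand-picked noise, and all names and values (p = 7, q = 1009, s = 3) are hypothetical.

```python
P, Q = 7, 1009

def centred(x, q=Q):
    """Representative of x mod q in the range (-q/2, q/2]."""
    x %= q
    return x - q if x > q // 2 else x

def make_ct(m, eta, s, c1):
    """A pair (c0, c1) with c0 - c1*s = m + p*eta (mod q), for hand-picked noise eta."""
    return ((c1 * s + m + P * eta) % Q, c1 % Q)

def add_ct(c, d):
    """Componentwise addition of two ciphertexts."""
    return ((c[0] + d[0]) % Q, (c[1] + d[1]) % Q)

def decrypt(ct, s):
    c0, c1 = ct
    return centred(c0 - c1 * s) % P

if __name__ == "__main__":
    s = 3
    c, d = make_ct(3, 5, s, 123), make_ct(6, -4, s, 456)
    print(decrypt(add_ct(c, d), s), (3 + 6) % P)       # the two values agree

    noisy = make_ct(3, 20, s, 77)                      # m + p*eta = 143
    acc = noisy
    for copies in range(2, 5):
        acc = add_ct(acc, noisy)
        print(copies, decrypt(acc, s), (copies * 3) % P)
```

With three copies of the noisy ciphertext the value $m + p \cdot \eta$ has grown to 429, still below q/2, and decryption is correct; with four copies it reaches 572, wraps around modulo q, and the decrypted value no longer matches the sum of the messages.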

It is not immediately obvious that the above scheme will support multiplication of ciphertexts.
To do so we need to augment the public key a little. As well as the pair (a, b) we also include
in the public key an encryption of $p^i \cdot s^2$, for $i \in [0, \ldots, \lceil \log_p q \rceil]$, in the following way. We pick
$a_i' \leftarrow R_q$ and $e_i' \leftarrow D_{R,\sigma}$ and set $b_i' \leftarrow a_i' \cdot s + p \cdot e_i' + p^i \cdot s^2 \pmod{q}$. Note that these are not really
encryptions of $p^i \cdot s^2$, since $p^i \cdot s^2 \pmod{q}$ is not even an element of the plaintext space $R_p$. Despite
this problem the following method will work, and it is useful to think of $(a_i', b_i')$ as an encryption of
$p^i \cdot s^2$.

Now suppose we have two ciphertexts $c = (c_0, c_1)$ and $c' = (c_0', c_1')$ encrypting m and m'. We
first form the tensor product ciphertext $c \otimes c' = (c_0 \cdot c_0', c_0 \cdot c_1', c_1 \cdot c_0', c_1 \cdot c_1') = (d_0, d_1, d_2, d_3)$. This
four-dimensional ciphertext will decrypt with respect to the “secret key” vector $(1, -s) \otimes (1, -s) = (1, -s, -s, s^2)$, since

\[ \begin{aligned}
d_0 - d_1 \cdot s - d_2 \cdot s + d_3 \cdot s^2 &= c_0 \cdot c_0' - c_0 \cdot c_1' \cdot s - c_1 \cdot c_0' \cdot s + c_1 \cdot c_1' \cdot s^2 \pmod{q} \\
  &= (c_0 - c_1 \cdot s) \cdot (c_0' - c_1' \cdot s) \pmod{q} \\
  &= (m + p \cdot \eta_c) \cdot (m' + p \cdot \eta_{c'}) \pmod{q} \\
  &= m \cdot m' + p \cdot (m \cdot \eta_{c'} + m' \cdot \eta_c + p \cdot \eta_c \cdot \eta_{c'}) \pmod{q}.
\end{aligned} \]

So we see that the “ciphertext” $c \otimes c'$ decrypts to the product of m and m' under the secret key
$(1, -s) \otimes (1, -s)$ assuming the noise is not too large. Since $p \ll q$ we have that the noise grows
roughly as $p \cdot \eta_c \cdot \eta_{c'}$. The problem is that the ciphertext is now twice as large; we would like to
reduce to a ciphertext with two components which decrypts in the usual way via the secret vector
$(1, -s)$.
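The tensor-product identity can be checked numerically in the same degenerate scalar setting, where ring elements are plain integers modulo q. This is an illustrative assumption only, with hypothetical names and hand-picked noise values. The four-component ciphertext is decrypted with the extended key $(1, -s, -s, s^2)$.

```python
P, Q = 7, 1009

def centred(x, q=Q):
    """Representative of x mod q in the range (-q/2, q/2]."""
    x %= q
    return x - q if x > q // 2 else x

def make_ct(m, eta, s, c1):
    """A pair (c0, c1) with c0 - c1*s = m + p*eta (mod q), for hand-picked noise eta."""
    return ((c1 * s + m + P * eta) % Q, c1 % Q)

def tensor(c, d):
    """(c0*d0, c0*d1, c1*d0, c1*d1), each reduced modulo q."""
    return (c[0] * d[0] % Q, c[0] * d[1] % Q, c[1] * d[0] % Q, c[1] * d[1] % Q)

def decrypt_tensor(d, s):
    """Decrypt a four-component ciphertext under the key (1, -s, -s, s^2)."""
    d0, d1, d2, d3 = d
    return centred(d0 - d1 * s - d2 * s + d3 * s * s) % P

if __name__ == "__main__":
    s = 3
    c = make_ct(3, 2, s, 123)     # m  = 3, noise  2, so m  + p*eta  = 17
    d = make_ct(5, -1, s, 456)    # m' = 5, noise -1, so m' + p*eta' = -2
    print(decrypt_tensor(tensor(c, d), s), (3 * 5) % P)
```

Here the product $(m + p \cdot \eta_c)(m' + p \cdot \eta_{c'}) = 17 \cdot (-2) = -34$ is small compared with q/2, so it is recovered exactly over the integers and reduces to $m \cdot m' \bmod p$.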

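To achieve this goal we use the data with which we augmented the public key, namely the
encryptions $(a_i', b_i')$ of $p^i \cdot s^2$. Given the four-dimensional ciphertext $(d_0, d_1, d_2, d_3)$ we write $d_3$ in its
base-p representation, i.e.

\[ d_3 = \sum_{i=0}^{\lceil \log_p q \rceil} p^i \cdot d_{3,i}, \]

where $d_{3,i} \in R_p$. We then set

\[ f_0 \leftarrow d_0 + \sum_{i=0}^{\lceil \log_p q \rceil} d_{3,i} \cdot b_i', \]

\[ f_1 \leftarrow d_1 + d_2 + \sum_{i=0}^{\lceil \log_p q \rceil} d_{3,i} \cdot a_i'. \]

Then the ciphertext $f = (f_0, f_1)$ decrypts to

\[ \begin{aligned}
f_0 - f_1 \cdot s &= \Big( d_0 + \sum_{i=0}^{\lceil \log_p q \rceil} d_{3,i} \cdot b_i' \Big) - \Big( d_1 + d_2 + \sum_{i=0}^{\lceil \log_p q \rceil} d_{3,i} \cdot a_i' \Big) \cdot s \pmod{q} \\
  &= d_0 - d_1 \cdot s - d_2 \cdot s + \sum_{i=0}^{\lceil \log_p q \rceil} \big( d_{3,i} \cdot p^i \cdot s^2 + d_{3,i} \cdot a_i' \cdot s + p \cdot e_i' \cdot d_{3,i} - d_{3,i} \cdot a_i' \cdot s \big) \pmod{q} \\
  &= \big( d_0 - d_1 \cdot s - d_2 \cdot s + d_3 \cdot s^2 \big) + p \cdot \sum_{i=0}^{\lceil \log_p q \rceil} e_i' \cdot d_{3,i} \pmod{q}.
\end{aligned} \]

So we obtain a ciphertext which decrypts to $m \cdot m'$ using the above equation, but with noise term

\[ \eta_f = m \cdot \eta_{c'} + m' \cdot \eta_c + p \cdot \eta_c \cdot \eta_{c'} + \sum_{i=0}^{\lceil \log_p q \rceil} e_i' \cdot d_{3,i}. \]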

We need to estimate the size of $\eta_f$ as a polynomial over the integers. To do this we use the
norm $\|f\|$ which returns the maximum of the absolute value of the coefficients of f. Suppose the
noise is bounded by $\|\eta_c\|, \|\eta_{c'}\| \le B$ for the two ciphertexts passed into the multiplication. Since
$m, m', d_{3,i} \in R_p$ we have that $\|m\|, \|m'\|, \|d_{3,i}\| \le p/2$. By definition we have $\|e_i'\| \le B_0$, and so we
have

\[ \|\eta_f\| \le \frac{p \cdot B}{2} + \frac{p \cdot B}{2} + p \cdot B^2 + \frac{\lceil \log_p q \rceil \cdot p \cdot B_0}{2} = p \cdot \Big( B + B^2 + \frac{\lceil \log_p q \rceil \cdot B_0}{2} \Big), \]

which is about $p \cdot B^2$. Thus when adding ciphertexts with noise B the noise increases to $2 \cdot B$, but
when multiplying we get noise $p \cdot B^2$.

Eventually, if we keep performing operations the noise will become too large and decryption
will fail, in particular when the noise is larger than q/4. Thus if we make q larger we can ensure
that we can evaluate more operations homomorphically. However, increasing q means we also need
to increase the degree of our ring to ensure the hardness of the Ring-LWE problem. In this way we
obtain a Somewhat Homomorphic Encryption (SHE) scheme, i.e. one which can evaluate a limited
number of addition and multiplication operations. To obtain a Fully Homomorphic Encryption
(FHE) scheme we need to be able to perform an arbitrary number of additions and multiplications.

The standard theoretical trick to obtain an FHE scheme from an SHE scheme is via a technique
called bootstrapping. We only outline the basic idea behind bootstrapping, as the details are very
complicated and the technique is currently impractical except for toy examples. The decryption
algorithm D is a function of the ciphertext c and the secret key s. We can thus represent the
function D as an arithmetic circuit C (i.e. a circuit involving only additions and multiplications)
of representations of c and s. To do this we first have to convert the representations of c and s
to be elements of $R_p$; taking the bit representations and using a binary circuit is an easy (but
inefficient) way of doing this.

Now  suppose  C  is  “simple”  enough  to  be  evaluated  by  the  SHE  scheme,  and  we  can  do  a  little 
more  than  just  evaluate  C,  say  perform  another  multiplication.  Our  Ring-LWE  scheme  indeed  has 
a  simple  decryption  circuit,  since  the  only  difficult  part  is  reduction  modulo  q  and  then  modulo  p, 
the  main  decryption  equation  being  linear.  We  can  then  “refresh”  or  “bootstrap”  a  ciphertext  c 
by  homomorphically  evaluating  the  circuit  C  on  the  actual  ciphertext  c  and  the  encryption  of  the 
secret key s. In other words, we add to the public key the encryption of the secret key, and then
by  homomorphically  evaluating  C  we  obtain  a  new  encryption  of  the  message  encrypted  by  c,  but 
with  a  smaller  noise  value. 


Chapter  Summary 


•  Complexity  theory  deals  with  the  worst-case  behaviour  of  algorithms  to  solve  a  given 
decision  problem;  some  problems  are  easy  on  average,  but  there  exist  certain  instances 
which  are  very  hard  to  solve. 

•  Problems  such  as  the  RSA  and  DDH  problems  are  hard  on  average,  since  they  possess 
random  self-reductions  from  a  given  instance  of  the  problem  to  a  random  instance  of  the 
problem. 

•  Cryptographic  systems  based  on  knapsack  problems  have  been  particularly  notorious  from 
this  perspective,  as  one  can  often  use  lattice  basis  reduction  to  break  them. 

•  To  obtain  secure  systems  based  on  lattices  one  should  select  lattice  problems  which  have 
worst-case  to  average-case  reductions. 

•  One  such  problem  is  the  LWE  problem,  another  is  the  closely  related  Ring-LWE  problem. 

•  These  lattice  problems  allow  us  to  construct  fully  homomorphic  encryption  schemes  and 
schemes  which  appear  to  resist  quantum  attacks. 


Further  Reading 

A  nice  introduction  to  complexity  theory  can  be  found  in  the  book  by  Goldreich.  A  discussion 
of  knapsack  based  systems  and  how  to  break  them  using  lattices  can  be  found  in  the  survey  article 
by  Odlyzko.  A  good  survey  of  the  more  modern  constructive  applications  of  lattices,  including 
LWE,  can  be  found  in  the  paper  by  Micciancio  and  Regev. 
CHAPTER 18


Certificates,  Key  Transport  and  Key  Agreement 


Chapter  Goals 

•  To  understand  the  problems  associated  with  managing  and  distributing  secret  keys. 

•  To  introduce  the  notion  of  digital  certificates  and  a  PKI. 

•  To  show  how  an  implicit  certificate  scheme  can  operate. 

•  To  learn  about  key  distribution  techniques  based  on  symmetric-key-based  protocols. 

•  To  introduce  key  transport  based  on  public  key  encryption. 

• To introduce Diffie-Hellman key exchange, and its various variations.

•  To  introduce  the  symbolic  and  computational  analysis  of  protocols. 

18.1.  Introduction 

We  can  now  perform  encryption  and  authentication  using  either  public  key  techniques  or  symmetric 
key  techniques.  However,  we  have  not  addressed  how  parties  actually  obtain  the  public  key  of  an 
entity,  or  obtain  a  shared  symmetric  key.  When  using  hybrid  ciphers  in  Section  16.3  we  showed 
how  a  symmetric  key  could  be  transported  to  another  party,  and  then  that  symmetric  key  used 
to  encrypt  a  single  message.  However,  we  did  not  address  what  happens  if  we  want  to  use  the 
symmetric  key  many  times,  or  use  it  for  authentication  etc.  Nor  did  we  address  how  the  sender 
would  know  that  the  public  key  of  the  receiver  he  was  using  was  genuine. 

In  this  chapter  we  present  methodologies  to  solve  all  of  these  problems.  In  doing  so  we  present 
our first examples of what can be called “cryptographic protocols”. A cryptographic protocol is an
exchange  of  messages  which  achieves  some  cryptographic  goal.  Up  until  now  we  have  created  single 
shot  mechanisms  (encryption,  MAC,  signatures  etc.)  which  have  not  required  interaction  between 
parties. 

In  dealing  with  cryptographic  protocols  we  still  need  to  define  what  we  mean  by  something 
being  secure,  and  when  presenting  a  protocol  we  need  to  present  a  proof  as  to  why  we  believe  the 
protocol  meets  our  security  definition.  However,  unlike  the  proofs  for  encryption  etc.  that  we  have 
met before, the proofs for protocols are incredibly complex. They fall into one of two camps (there
is also a third camp, called the simulation paradigm and typified by the Universal Composability
(UC) framework, but it is beyond the scope of this book):

• In the first camp are so-called “symbolic methods”. These treat underlying primitives such
as encryption as perfect black boxes and then try to show that a protocol is “secure”.
However, as they work at a very high level of abstraction they do not really prove security;
they simply enable one to find attacks on a protocol relatively easily. This is because in
this camp one shows a protocol is insecure by exhibiting an attack.

• The second camp is like our security games for encryption etc. We define a game, with an
adversary. The adversary has certain powers (given by oracles), and has a certain goal.
We then present proofs that the existence of such an adversary implies the existence of an
algorithm which can solve a hard problem. These proofs provide a high degree of assurance
that the protocol is sound, and any security flaws will come from implementation aspects
as opposed to protocol design issues.

Before  we  continue  we  need  to  distinguish  between  different  types  of  keys.  The  following  terminology 
will  be  used  throughout  this  chapter  and  beyond: 

•  Static  (or  Long-Term)  Keys:  These  are  keys  which  are  to  be  in  use  for  a  long  time 
period.  The  exact  definition  of  long  will  depend  on  the  application,  but  this  could  mean 
from  a  few  hours  to  a  few  years.  The  compromising  of  a  static  key  is  usually  considered 
to  be  a  major  problem,  with  potentially  catastrophic  consequences. 

•  Ephemeral,  or  Session  (or  Short-Term)  Keys:  These  are  keys  which  have  a  short 
lifetime,  maybe  a  few  seconds  or  a  day.  They  are  usually  used  to  provide  confidentiality 
for  a  given  time  period.  The  compromising  of  a  session  key  should  only  result  in  the 
compromising  of  that  session’s  secrecy  and  it  should  not  affect  the  long-term  security  of 
the  system. 

Key  distribution  is  one  of  the  fundamental  problems  of  cryptography.  There  are  a  number  of 
solutions  to  this  problem;  which  solution  one  chooses  depends  on  the  overall  system. 

•  Physical  Distribution:  Using  trusted  couriers  or  armed  guards,  keys  can  be  distributed 
using  traditional  physical  means.  Until  the  1970s  this  was  in  effect  the  only  secure  way  of 
distributing  keys  at  system  setup.  It  has  a  large  number  of  physical  problems  associated 
with  it,  especially  scalability,  but  the  main  drawback  is  that  security  no  longer  rests  with 
the  key  but  with  the  courier.  If  we  can  bribe,  kidnap  or  kill  the  courier  then  we  have 
broken  the  system. 

•  Distribution  Using  Symmetric  Key  Protocols:  Once  some  secret  keys  have  been 
distributed  between  a  number  of  users  and  a  trusted  central  authority,  we  can  use  the 
trusted  authority  to  help  generate  keys  for  any  pair  of  users  as  the  need  arises.  Protocols 
to  perform  this  task  will  be  discussed  in  this  chapter.  They  are  usually  very  efficient  but 
have some drawbacks. In particular they usually assume that the trusted authority
and the two users who wish to agree on a key are all online. They also still require a
physical means to set up the initial keys.

• Distribution Using Public Key Protocols: Using public key cryptography, two parties,
who have never met or who do not trust any one single authority, can produce a
shared secret key. This can be done in an online manner, using a key exchange protocol.
Indeed this is the most common application of public key techniques for encryption.
Rather than encrypting large amounts of data by public key techniques we agree a key by
public key techniques and then use a symmetric cipher to actually do the encryption, a
methodology we saw earlier when we discussed hybrid encryption.

To  understand  the  scale  of  the  problem,  if  our  system  is  to  cope  with  n  separate  users,  and  each 
user  may  want  to  communicate  securely  with  any  other  user,  then  we  require 

\[
\frac{n \cdot (n - 1)}{2}
\]

separate  symmetric  keys.  This  soon  produces  huge  key  management  problems;  a  small  university 
with  around  10  000  students  would  need  to  have  around  fifty  million  separate  secret  keys. 
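
The figure quoted for the university can be checked directly. The short sketch below simply evaluates n · (n − 1)/2 for a few user counts; the function name is an illustrative choice.

```python
def pairwise_keys(n):
    """Number of distinct symmetric keys if every pair of n users shares one."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    for users in (10, 1_000, 10_000):
        print(users, pairwise_keys(users))
```

For 10 000 users this gives 49 995 000 keys, which is the "around fifty million" mentioned above.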

With a large number of keys in existence one finds a large number of problems. For example,
what happens when your key is compromised, in other words when someone else has found your key?
What can you do about it? What can they do? Hence, a large number of keys produces a large
key management problem.

One solution is for each user to hold only one key with which they communicate with a central
authority; a system with n users will then only require n keys. When two users wish to communicate,
they generate a secret key which is only to be used for that message, a so-called session key
or ephemeral key. This session key can be generated with the help of the central authority using
one of the protocols that appear later in this chapter.

As  we  have  mentioned  already  the  main  problem  is  one  of  managing  the  secure  distribution  of 
keys.  Even  a  system  which  uses  a  trusted  central  authority  needs  some  way  of  getting  the  keys 
shared  between  the  centre  and  each  user  out  to  the  user.  One  possible  solution  is  key  splitting 
(more  formally  called  secret  sharing)  where  we  divide  the  key  into  a  number  of  shares 

\[
k = k_1 \oplus k_2 \oplus \cdots \oplus k_r.
\]

The  shares  are  then  distributed  via  separate  routes.  The  beauty  of  this  is  that  an  attacker  needs  to 
attack  all  the  routes  so  as  to  obtain  the  key.  On  the  other  hand  attacking  one  route  will  stop  the 
legitimate  user  from  recovering  the  key.  We  will  discuss  secret  sharing  in  more  detail  in  Chapter 
19. 
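
The XOR-based key splitting described above can be sketched in a few lines. This is a minimal illustration, assuming the key is a byte string and that the first r − 1 shares are drawn at random; the function names are hypothetical.

```python
import secrets


def xor_bytes(x, y):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(x, y))


def split_key(key, r):
    """Split key into r shares whose XOR equals key."""
    # The first r - 1 shares are uniformly random ...
    shares = [secrets.token_bytes(len(key)) for _ in range(r - 1)]
    # ... and the last share is chosen so that all shares XOR to the key.
    last = key
    for share in shares:
        last = xor_bytes(last, share)
    return shares + [last]


def combine_shares(shares):
    """Recover the key by XORing all shares together."""
    key = bytes(len(shares[0]))  # all-zero starting value
    for share in shares:
        key = xor_bytes(key, share)
    return key


if __name__ == "__main__":
    key = secrets.token_bytes(16)
    shares = split_key(key, 3)
    print(combine_shares(shares) == key)
```

Any proper subset of the shares is just a collection of random-looking strings, which is why an attacker must intercept every route, and also why losing a single share prevents the legitimate user from recovering the key.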

One  issue  one  needs  to  consider  when  generating  and  storing  keys  is  the  key  lifetime.  A  general 
rule  is  that  the  longer  the  key  is  in  use  the  more  vulnerable  it  will  be  and  the  more  valuable  it 
will  be  to  an  attacker.  We  have  already  touched  on  this  when  mentioning  the  use  of  session  keys. 
However, it is important to destroy keys properly after use. Relying on an operating system to
delete a file by typing del or rm does not mean that an attacker cannot recover the file contents by
examining the hard disk. Usually deleting a file does not destroy the file contents; it only signals
that the file’s location is now available for overwriting with new data. A similar problem occurs
when  deleting  memory  in  an  application. 

This (rather lengthy) chapter is organized in the following sections. We first introduce the
notion of certificates and a Public Key Infrastructure (PKI); such techniques allow parties to obtain
authentic public keys. We then discuss a number of protocols which allow parties to agree new
symmetric ephemeral keys, given already deployed static symmetric keys. Then we discuss how
to obtain new symmetric ephemeral keys, given existing public keys (which can be authenticated
via a PKI). Then to give a flavour of how protocols are analysed we present a simple symbolic
method called BAN Logic and apply it to one of our symmetric-key-based protocols. We then
give a detailed exposition of how one uses game style security definitions to show that the
public-key-based protocols are secure. Note that we can use symbolic methods to analyse
public-key-based techniques, and game style security to analyse symmetric-key-based techniques. However,
for reasons of space we only give a limited set of examples.

18.2.  Certificates  and  Certificate  Authorities 

When  using  a  symmetric  key  system  we  assume  we  do  not  have  to  worry  about  which  key  belongs 
to which party. It is tacitly assumed that if Alice holds a long-term secret key $K_{ab}$ which she thinks
is  shared  with  Bob,  then  Bob  really  does  have  a  copy  of  the  same  key.  This  assurance  is  often 
achieved  using  a  trusted  physical  means  of  long-term  key  distribution,  for  example  using  armed 
couriers. 

In  a  public  key  system  the  issues  are  different.  Alice  may  have  a  public  key  which  she  thinks  is 
associated  with  Bob,  but  we  usually  do  not  assume  that  Alice  is  one  hundred  percent  certain  that 
it  really  belongs  to  Bob.  This  is  because  we  do  not,  in  the  public  key  model,  assume  a  physically 
secure  key  distribution  system.  After  all,  that  was  one  of  the  points  of  public  key  cryptography  in 
the first place: to make key management easier. Alice may have obtained the public key she thinks
belongs  to  Bob  from  Bob’s  web  page,  but  how  does  she  know  the  web  page  has  not  been  spoofed? 

The  process  of  linking  a  public  key  to  an  entity  or  principal,  be  it  a  person,  machine  or  process, 
is  called  binding.  One  way  of  binding,  common  in  many  applications  where  the  principal  really  does 
need  to  be  present,  is  by  using  a  physical  token  such  as  a  smart  card.  Possession  of  the  token,  and 
knowledge  of  any  PIN/password  needed  to  unlock  the  token,  is  assumed  to  be  equivalent  to  being 
the  designated  entity.  This  solution  has  a  number  of  problems  associated  with  it,  since  cards  can 
be lost or stolen, which is why we protect them using a PIN (or in more important applications by
using biometrics). The major problem is that most entities are non-human; they are computers and
computers  do  not  carry  cards.  In  addition  many  public  key  protocols  are  performed  over  networks 
where  physical  presence  of  the  principal  (if  they  are  human)  is  not  something  one  can  test. 

Hence, some form of binding is needed which can be used in a variety of very different applications.
The main binding tool in use today is the digital certificate. In this a special trusted third
party,  or  TTP,  called  a  certificate  authority,  or  CA,  is  used  to  vouch  for  the  validity  of  the  public 
keys.  A  certificate  authority  system  works  as  follows: 

•  All  users  have  a  trusted  copy  of  the  public  key  of  the  CA.  For  example  these  come  embedded 
in  your  browser  when  you  buy  your  computer,  and  you  “of  course”  trust  the  vendor  of  the 
computer  and  the  manufacturer  of  the  software  on  your  computer. 

•  The  CA’s  job  is  to  digitally  sign  data  strings  containing  the  following  information 

(Alice,  Alice’s  public  key). 

This  data  string  and  the  associated  signature  is  called  a  digital  certificate.  The  CA  will 
only  sign  this  data  if  it  truly  believes  that  the  public  key  really  does  belong  to  Alice. 

•  When  Alice  now  sends  you  her  public  key,  contained  in  a  digital  certificate,  you  now  trust 
that  the  purported  key  really  is  that  of  Alice,  since  you  trust  the  CA  to  do  its  job  correctly. 

This  use  of  a  digital  certificate  binds  the  name  “Alice”  with  the  key.  Public  key  certificates  will 
typically  (although  not  always)  be  stored  in  repositories  and  accessed  as  required.  For  example, 
most browsers keep a list of the certificates that they have come across. The digital certificates do
not need to be stored securely: since they are digitally signed, any tampering will be detected when
the signature is verified.


To  see  the  advantage  of  certificates  and  CAs  in  more  detail  consider  the  following  example  of  a 
world  without  a  CA.  In  the  following  discussion  we  break  with  our  colour  convention  for  a  moment 
and  now  use  red  to  signal  public  keys  which  must  be  obtained  in  an  authentic  manner  and  blue  to 
signal  public  keys  which  do  not  need  to  be  obtained  in  an  authentic  manner. 

In  a  world  without  a  CA  you  obtain  many  individual  public  keys  from  each  individual  in  some 
authentic  fashion.  For  example 


6A5DEF....A21  Jim  Bean’s  public  key, 

7F341A....BFF  Jane  Doe’s  public  key, 

B5F34A....E6D  Microsoft’s  update  key. 

Hence,  each  key  needs  to  be  obtained  in  an  authentic  manner,  as  does  every  new  key  you  obtain. 

Now  consider  the  world  with  a  CA.  You  obtain  a  single  public  key  in  an  authentic  manner, 
namely  the  CA’s  public  key.  We  shall  call  our  CA  Ted  since  he  is  Trustworthy.  You  then  obtain 
many  individual  public  keys,  signed  by  the  CA,  in  possibly  an  unauthentic  manner.  For  example 
they  could  be  attached  at  the  bottom  of  an  email,  or  picked  up  whilst  browsing  the  web. 


A45EFB....C45  Ted’s totally trustworthy key,

6A5DEF....A21  Ted says “This is Jim Bean’s public key”,

7F341A....BFF  Ted says “This is Jane Doe’s public key”,

B5F34A....E6D  Ted says “This is Microsoft’s update key”.


If  you  trust  Ted’s  key  and  you  trust  Ted  to  do  his  job  correctly  then  you  trust  all  the  public  keys 
you  hold  to  be  authentic. 

In  general  a  digital  certificate  is  not  just  a  signature  on  the  single  pair  (Alice,  Alice’s  public 
key);  one  can  place  all  sorts  of  other,  possibly  application  specific,  information  into  the  certificate. 
For  example  it  is  usual  for  the  certificate  to  contain  the  following  information. 

•  User’s  name, 

•  User’s  public  key, 
• Is this an encryption or signing key?

•  Name  of  the  CA, 

•  Serial  number  of  the  certificate, 

• Expiry date of the certificate.

Commercial  certificate  authorities  exist  that  will  produce  a  digital  certificate  for  your  public  key, 
often  after  payment  of  a  fee  and  some  checks  on  whether  you  are  who  you  say  you  are.  The 
certificates  produced  by  commercial  CAs  are  often  made  public,  so  one  could  call  them  public 
“public key certificates”, in that their use is mainly over open public networks. CAs are also used
in  proprietary  closed  systems,  for  example  in  debit/credit  card  systems  or  by  large  corporations. 

It  is  common  for  more  than  one  CA  to  exist.  A  quick  examination  of  the  properties  of  your 
web  browser  will  reveal  a  large  number  of  certificate  authorities  which  your  browser  assumes  you 
“trust”  to  perform  the  function  of  a  CA.  As  there  is  more  than  one  CA  it  is  common  for  one  CA  to 
sign  a  digital  certificate  containing  the  public  key  of  another  CA,  and  vice  versa,  a  process  which 
is  known  as  cross-certification. 

Cross-certification  is  needed  if  more  than  one  CA  exists,  since  a  user  may  not  have  a  trusted 
copy  of  the  CA’s  public  key  needed  to  verify  another  user’s  digital  certificate.  This  is  solved  by 
cross-certificates,  i.e.  one  CA’s  public  key  is  signed  by  another  CA.  The  user  first  verifies  the 
appropriate  cross-certificate,  and  then  verifies  the  user  certificate  itself. 

With  many  CAs  one  can  get  quite  long  certificate  chains,  as  Figure  18.1  illustrates.  Suppose 
Bob  trusts  the  Root  CA’s  public  key  and  he  obtains  Alice’s  public  key  which  is  signed  by  the 
private  key  of  CA3.  He  then  obtains  CA3’s  public  key,  either  along  with  Alice’s  digital  certificate 
or  by  some  other  means.  CA3’s  public  key  comes  in  a  certificate  which  is  signed  by  the  private  key 
of  CA1.  Bob  then  needs  to  obtain  the  public  key  of  CA1,  which  will  be  contained  in  a  certificate 
signed  by  the  Root  CA.  Hence,  by  verifying  all  the  signatures  he  ends  up  trusting  Alice’s  public 
key. 


[Figure: certification hierarchy diagram; the node labels recoverable from the original are Root CA, CA1, CA2, Alice and Bob.]

Figure 18.1. Example certification hierarchy
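
The chain walk described above (Root CA, then CA1, then CA3, then Alice) can be sketched as follows. This is only an illustration under loud assumptions: the signature scheme is toy textbook RSA built from Mersenne primes with no padding, which is not secure, and the certificate layout, field names and helper functions are all hypothetical rather than any real certificate format.

```python
import hashlib

E = 65537  # public exponent shared by all toy keys (assumption)


def keygen(p, q):
    """Return (public modulus, private exponent) for toy textbook RSA."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))


def digest(data, n):
    """Hash data to an integer modulo n."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data, n, d):
    return pow(digest(data, n), d, n)


def verify(data, sig, n):
    return pow(sig, E, n) == digest(data, n)


def body(subject, key, issuer):
    """The data string that the issuer signs."""
    return f"{subject}|{key}|{issuer}".encode()


def issue(subject, subject_key, issuer, issuer_n, issuer_d):
    """The issuer signs the string binding subject to subject_key."""
    sig = sign(body(subject, subject_key, issuer), issuer_n, issuer_d)
    return {"subject": subject, "key": subject_key, "issuer": issuer, "sig": sig}


def verify_chain(chain, root_key):
    """Walk from the trusted root key down the chain.

    Returns the last subject's public key, or None if any signature fails.
    """
    trusted = root_key
    for cert in chain:
        data = body(cert["subject"], cert["key"], cert["issuer"])
        if not verify(data, cert["sig"], trusted):
            return None
        trusted = cert["key"]  # this key is now vouched for
    return trusted


# Toy primes (Mersenne primes, reused between keys); never do this in practice.
M61, M89, M107, M127 = 2**61 - 1, 2**89 - 1, 2**107 - 1, 2**127 - 1
root_n, root_d = keygen(M61, M89)
ca1_n, ca1_d = keygen(M89, M107)
ca3_n, ca3_d = keygen(M107, M127)
alice_n, _ = keygen(M61, M127)

chain = [
    issue("CA1", ca1_n, "Root CA", root_n, root_d),
    issue("CA3", ca3_n, "CA1", ca1_n, ca1_d),
    issue("Alice", alice_n, "CA3", ca3_n, ca3_d),
]

if __name__ == "__main__":
    print(verify_chain(chain, root_n) == alice_n)
```

Bob only needs an authentic copy of the root modulus; every other key in the chain is accepted because the key above it has signed it.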


Often  the  function  of  a  CA  is  split  into  two  parts.  One  part  deals  with  verifying  the  user’s 
identity and one part actually signs the public keys. The signing is performed by the CA, whilst
the task of verifying the identity of the user is parcelled out to a registration authority, or RA. This can be a good
practice,  with  the  CA  implemented  in  a  more  secure  environment  to  protect  the  long-term  private 
key. 

The  main  problem  with  a  CA  system  arises  when  a  user’s  public  key  is  compromised  or  becomes 
untrusted  for  some  reason.  For  example 

•  A  third  party  has  gained  knowledge  of  the  private  key, 

•  An  employee  leaves  the  company. 
As the public key is no longer to be trusted all the associated digital certificates are now invalid
and need to be revoked. But these certificates can be distributed over a large number of users, each
one of which needs to be told to no longer trust this certificate. The CA must somehow inform
all users that the certificate(s) containing this public key is/are no longer valid, in a process called
certificate revocation.

One  way  to  accomplish  this  is  via  a  Certificate  Revocation  List,  or  CRL,  which  is  a  signed 
statement  by  the  CA  containing  the  serial  numbers  of  all  certificates  which  have  been  revoked  by 
that  CA  and  whose  validity  period  has  not  expired.  One  clearly  need  not  include  in  this  the  serial 
numbers  of  certificates  which  have  passed  their  expiry  date.  Users  must  then  ensure  they  have 
the  latest  CRL.  This  can  be  achieved  by  issuing  CRLs  at  regular  intervals  even  if  the  list  has  not 
changed.  Such  a  system  can  work  well  in  a  corporate  environment  when  overnight  background  jobs 
are  often  used  to  make  sure  each  desktop  computer  in  the  company  is  up  to  date  with  the  latest 
software  etc.  For  other  situations  it  is  hard  to  see  how  the  CRLs  can  be  distributed,  especially  if 
there  are  a  large  number  of  CAs  trusted  by  each  user. 
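
The check a user performs against a CRL can be sketched as below. This is a simplified illustration: it assumes the CRL has already been verified as signed by the CA (that step is omitted), and the function name, the one-day freshness window and the serial numbers are all made-up assumptions.

```python
from datetime import date, timedelta


def certificate_is_usable(serial, expiry, crl_serials, crl_issued, today,
                          max_crl_age=timedelta(days=1)):
    """Return True if the certificate is unexpired and not on a fresh CRL."""
    if today > expiry:
        return False  # expired certificates are rejected regardless of the CRL
    if today - crl_issued > max_crl_age:
        return False  # the CRL is stale: refuse rather than trust old data
    return serial not in crl_serials


if __name__ == "__main__":
    crl = {1042, 2077}  # serial numbers revoked by the CA (made-up values)
    today = date(2024, 5, 2)
    print(certificate_is_usable(1001, date(2025, 1, 1), crl, today, today))
    print(certificate_is_usable(1042, date(2025, 1, 1), crl, today, today))
```

Rejecting a stale CRL is one possible design choice; it reflects the point above that users must ensure they hold the latest list.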

The  whole  system  of  CAs  and  certificates  is  often  called  the  Public  Key  Infrastructure,  or  PKI. 
This  essentially  allows  a  distribution  of  trust;  the  need  to  trust  the  authenticity  of  each  individual 
public  key  in  your  possession  is  replaced  by  the  need  to  trust  a  body,  the  CA,  to  do  its  job  correctly. 

18.2.1.  Implicit  Certificates:  One  issue  with  digital  certificates  is  that  they  can  be  rather  large. 
Each  certificate  needs  to  at  least  contain  both  the  public  key  of  the  user  and  the  signature  of  the 
certificate  authority  on  that  key.  This  can  lead  to  quite  large  certificate  sizes,  as  the  following  table 
demonstrates: 

(sizes in bits)  RSA   DSA   EC-DSA
User’s key       2048  2048  256
CA sig           2048  512   512

This  assumes  that  for  RSA  keys  one  uses  a  2048-bit  modulus,  for  DSA  one  uses  a  2048-bit  prime 
p  and  a  256-bit  prime  q  and  for  EC-DSA  one  uses  a  256-bit  curve.  Hence,  for  example,  if  the  CA 
is  using  2048-bit  RSA  and  they  are  signing  the  public  key  of  a  user  using  2048-bit  DSA  then  the 
total  certificate  size  must  be  at  least  4096  bits. 

Implicit  certificates  enable  these  sizes  to  be  reduced  somewhat.  An  implicit  certificate  looks 
like X|Y where

•  X  is  the  data  being  bound  to  the  public  key, 

•  Y  is  the  implicit  certificate  on  X. 

From  Y  we  need  to  be  able  to  recover  the  public  key  being  bound  to  X  and  implicit  assurance  that 
the  certificate  was  issued  by  the  CA.  In  the  system  we  describe  below,  based  on  a  DSA  or  EC-DSA, 
the  size  of  Y  will  be  2048  or  256  bits  respectively.  Hence,  the  size  of  the  certificate  is  reduced  to 
the  size  of  the  public  key  being  certified. 

System Set-up: The CA chooses a public group G of known order n and an element $P \in G$. The
CA then chooses a long-term private key c and computes the public key $Q \gets P^c$. This public key
should be known to all users.

Certificate Request: Suppose Alice wishes to request a certificate and the public key associated
with the information ID, which could be her name. Alice computes an ephemeral secret key t and
an ephemeral public key $R \gets P^t$. Alice sends R and ID to the CA.

Processing  of  the  Request:  The  CA  checks  that  he  wants  to  link  ID  with  Alice.  The  CA  picks 
another  random  number  k  and  computes 

\[
g \gets P^k \cdot R = P^k \cdot P^t = P^{k+t}.
\]

Then the CA computes $s \gets c \cdot H(ID \| g) + k \pmod{n}$. Then the CA sends back to Alice the pair
(g, s). The implicit certificate is the pair (ID, g). We now have to convince you that, not only can
Alice recover a valid public/private key pair, but also any other user can recover Alice’s public key
from this implicit certificate.

Alice’s Key Discovery: Alice knows the following information: $t$, $s$, $R = P^t$. From this she can
recover her private key via $a \gets t + s \pmod{n}$. Note that Alice’s private key is known only to Alice
and not to the CA. In addition Alice has contributed some randomness t to her private key, as has
the CA who contributed k. Her public key is then $P^a = P^{t+s} = P^t \cdot P^s = R \cdot P^s$.

User’s Key Discovery: Since s and R are public, a user, say Bob, can recover Alice’s public key
from the above message flows via $R \cdot P^s$. But this says nothing about the linkage between the CA,
Alice’s public key and the ID information. Instead Bob recovers the public key from the implicit
certificate (ID, g) and the CA’s public key Q via the equation $P^a = Q^{H(ID \| g)} \cdot g$. This holds
since $Q^{H(ID \| g)} \cdot g = P^{c \cdot H(ID \| g)} \cdot P^{k+t} = P^{s+t} = P^a$.

As  soon  as  Bob  sees  Alice’s  key  used  in  action,  say  he  verifies  a  signature  purported  to  have 
been  made  by  Alice,  he  knows  implicitly  that  it  must  have  been  issued  by  the  CA,  since  otherwise 
Alice’s  signature  would  not  verify  correctly. 
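
The whole implicit certificate flow can be traced in a toy group. The sketch below follows the steps above (set-up, request, processing, key discovery) in the order-1019 subgroup of the integers modulo 2039, generated by 4. These parameters are far too small for any real use and are chosen purely for illustration; the hash H is instantiated with SHA-256 reduced modulo n, and all function names are assumptions.

```python
import hashlib
import secrets

# Toy group: the subgroup of order n = 1019 in the integers modulo p = 2039.
p, n, P = 2039, 1019, 4


def H(identity, g):
    """Hash ID || g to an integer modulo n."""
    data = identity.encode() + b"|" + str(g).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def random_exponent():
    return secrets.randbelow(n - 1) + 1  # uniform in 1..n-1


def ca_setup():
    c = random_exponent()
    return c, pow(P, c, p)              # private key c, public key Q = P^c


def alice_request():
    t = random_exponent()
    return t, pow(P, t, p)              # ephemeral secret t, public R = P^t


def ca_process(c, identity, R):
    k = random_exponent()
    g = pow(P, k, p) * R % p            # g = P^k * R = P^(k+t)
    s = (c * H(identity, g) + k) % n    # s = c * H(ID || g) + k mod n
    return g, s


def alice_private_key(t, s):
    return (t + s) % n                  # a = t + s mod n


def recover_public_key(Q, identity, g):
    return pow(Q, H(identity, g), p) * g % p   # P^a = Q^H(ID || g) * g


if __name__ == "__main__":
    c, Q = ca_setup()
    t, R = alice_request()
    g, s = ca_process(c, "Alice", R)
    a = alice_private_key(t, s)
    # Bob's view, computed from (ID, g) and Q only, matches Alice's key.
    print(pow(P, a, p) == recover_public_key(Q, "Alice", g))
```

Note that Bob's computation uses only the implicit certificate (ID, g) and the CA's public key Q, whereas Alice's private key a depends on her own secret t, which the CA never sees.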

There  are  a  number  of  problems  with  the  above  system  which  mean  that  implicit  certificates 
are  not  used  much  in  real  life.  For  example: 

(1)  What  do  you  do  if  the  CA’s  key  is  compromised?  Usually  you  pick  a  new  CA  key  and 
re-certify  the  user’s  keys.  But  you  cannot  do  this  since  the  user’s  public  key  is  chosen 
interactively during the certification process.

(2)  Implicit  certificates  require  the  CA  and  users  to  work  at  the  same  security  level.  This  is 
not  considered  good  practice,  as  usually  one  expects  the  CA  to  work  at  a  higher  security 
level  (say  4096-bit  DSA)  than  the  user  (say  2048-bit  DSA). 

However  for  devices  with  restricted  bandwidth  implicit  certificates  can  offer  a  suitable  alternative 
where  traditional  certificates  are  not  viable. 

18.3.  Fresh  Ephemeral  Symmetric  Keys  from  Static  Symmetric  Keys 

Recall  that  if  we  have  n  users  each  pair  of  whom  wishes  to  communicate  securely  with  each  other 
then  we  would  require 

\[
\frac{n \cdot (n - 1)}{2}
\]

separate  static  symmetric  key  pairs.  As  remarked  earlier  this  leads  to  huge  key  management 
problems  and  issues  related  to  the  distribution  of  the  keys.  We  have  already  mentioned  that  it  is 
better  to  use  session  keys  and  few  long-term  keys,  but  we  have  not  explained  how  one  deploys  the 
session  keys. 

To  solve  this  problem  a  number  of  protocols  which  make  use  of  symmetric  key  cryptography  to 
distribute  secret  session  keys  have  been  developed,  some  of  which  we  shall  describe  in  this  section. 
Later  we  shall  look  at  public  key  techniques  for  this  problem,  which  are  often  more  elegant.  We 
first  need  to  set  up  some  notation  to  describe  the  protocols.  Firstly  we  set  up  the  names  of  the 
parties  and  quantities  involved. 

•  Parties/Principals:  A,B,S. 

Assume  the  two  parties  who  wish  to  agree  a  secret  are  A  and  B,  for  Alice  and  Bob.  We 
assume  that  they  will  use  a  trusted  third  party,  or  TTP,  which  we  shall  denote  by  S. 

• Shared Secret Keys: $K_{ab}$, $K_{bs}$, $K_{as}$.

$K_{ab}$ will denote a secret key known only to A and B.

• Nonces: $N_a$, $N_b$.

Just as in Chapter 13 nonces are numbers used only once; they do not need to be random,
just unique. The quantity $N_a$ will denote a nonce originally produced by the principal A.
Note that other notations for nonces are possible and we will introduce them as the need
arises.

• Timestamps: Ta, Tb, Ts.

The  quantity  Ta  is  a  timestamp  produced  by  A.  When  timestamps  are  used  we  assume 
that  the  parties  try  to  keep  their  clocks  in  synchronization  using  some  other  protocol. 

The  statement 

$A \to B : M, A, B, \{N_a, M, A, B\}_{K_{as}}$

means  A  sends  to  B  the  message  to  the  right  of  the  colon.  The  message  consists  of 

•  A  nonce  M, 

•  A  the  name  of  party  A, 

•  B  the  name  of  party  B , 

• A message {Na, M, A, B} encrypted under the key Kas which A shares with S. Hence,
the  recipient  B  is  unable  to  read  the  encrypted  part  of  this  message. 

18.3.1.  Wide-Mouth  Frog  Protocol:  Our  first  protocol  is  the  Wide-Mouth  Frog  protocol, 
which is a simple protocol invented by Burrows. The protocol transfers a key Kab from A to
B  via  S ;  it  uses  only  two  messages  but  has  a  number  of  drawbacks.  In  particular  it  requires  the  use 
of  synchronized  clocks,  which  can  cause  a  problem  in  implementations.  In  addition  the  protocol 
assumes  that  A  chooses  the  session  key  Kab  and  then  transports  this  key  to  user  B.  This  implies 
that  user  A  is  trusted  by  user  B  to  be  competent  in  making  and  keeping  keys  secret.  This  is  a  very 
strong  assumption  and  the  main  reason  that  this  protocol  is  not  used  much  in  real  life.  However,  it 
is  very  simple  and  gives  a  good  example  of  how  to  analyse  a  protocol  formally,  which  we  shall  come 
to  later  in  this  chapter.  The  protocol  proceeds  in  the  following  steps,  as  illustrated  in  Figure  18.2, 

$A \to S : A, \{T_a, B, K_{ab}\}_{K_{as}}$,

$S \to B : \{T_s, A, K_{ab}\}_{K_{bs}}$.

On  obtaining  the  first  message  the  trusted  third  party  S  decrypts  the  last  part  of  the  message  and 
checks  that  the  timestamp  is  recent.  This  decrypted  message  tells  S  it  should  forward  the  key  to 
the  party  called  B.  If  the  timestamp  is  verified  to  be  recent,  S  encrypts  the  key  along  with  its 
timestamp  and  passes  this  encryption  on  to  B.  On  obtaining  this  message  B  decrypts  the  message 
received and checks that the timestamp is recent; then he can recover both the key Kab and the name
A  of  the  person  who  wants  to  send  data  to  him  using  this  key.  The  checks  on  the  timestamps  mean 
the  session  key  should  be  recent,  in  that  it  left  user  A  a  short  time  ago.  However,  user  A  could 
have  generated  this  key  years  ago  and  stored  it  on  her  hard  disk,  in  which  time  Eve  broke  in  and 
took  a  copy  of  this  key. 

We  already  said  that  this  protocol  requires  that  all  parties  need  to  keep  synchronized  clocks. 
However,  this  is  not  such  a  big  problem  since  S  checks  or  generates  all  the  timestamps  used  in  the 
protocol.  Hence,  each  party  only  needs  to  record  the  difference  between  its  clock  and  the  clock 
owned  by  S.  Clocks  are  then  updated  if  a  clock  drift  occurs  which  causes  the  protocol  to  fail.  This 
protocol  is  really  too  simple;  much  of  the  simplicity  comes  by  assuming  synchronized  clocks  and  by 
assuming  party  A  can  be  trusted  with  creating  session  keys. 
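
The two Wide-Mouth Frog messages and the timestamp checks can be sketched symbolically in Python. This is only an illustrative model under stated assumptions: `Box`, `seal` and `unseal` stand in for encryption (no real cryptography is performed), times are plain integers, and the freshness window `MAX_SKEW` is a hypothetical parameter not fixed by the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Symbolic ciphertext {payload}_key; it only remembers its key."""
    key: str
    payload: tuple

def seal(key, *payload):
    return Box(key, payload)

def unseal(key, box):
    if box.key != key:
        raise ValueError("wrong key")
    return box.payload

MAX_SKEW = 5  # hypothetical freshness window, in seconds

def server_forward(now, sender, box, keys):
    """S handles  A -> S : A, {Ta, B, Kab}_Kas  and builds {Ts, A, Kab}_Kbs."""
    t_a, recipient, k_ab = unseal(keys[sender], box)
    if abs(now - t_a) > MAX_SKEW:
        raise ValueError("stale timestamp")
    return recipient, seal(keys[recipient], now, sender, k_ab)

def bob_receive(now, box, k_bs):
    """B checks the timestamp Ts and recovers the sender's name and Kab."""
    t_s, sender, k_ab = unseal(k_bs, box)
    if abs(now - t_s) > MAX_SKEW:
        raise ValueError("stale timestamp")
    return sender, k_ab

keys = {"A": "Kas", "B": "Kbs"}                   # long-term keys held by S
msg1 = seal(keys["A"], 100, "B", "Kab")           # Alice sends at time 100
to, msg2 = server_forward(101, "A", msg1, keys)   # S forwards at time 101
result = bob_receive(102, msg2, keys["B"])        # Bob accepts at time 102
```

Replaying `msg1` to the server much later, say at time 200, is rejected by the timestamp check, which is exactly the role the synchronized clocks play in this protocol.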

18.3.2. Needham-Schroeder Protocol: We shall now look at more complicated protocols,
starting with one of the most famous, namely the Needham-Schroeder protocol. This protocol
was developed in 1978, and is one of the most highly studied protocols ever; its fame is due to the fact
that  even  a  simple  protocol  can  hide  security  flaws  for  a  long  time.  We  shall  build  the  protocol  up 
slowly,  starting  with  a  simple,  obviously  insecure,  protocol  and  then  addressing  each  issue  we  find 
until  we  reach  the  final  protocol.  Our  goal  is  to  come  up  with  a  protocol  which  does  not  require 
synchronized clocks.

Figure 18.2. Wide-Mouth Frog protocol

In  all  of  our  analysis  we  assume  that  the  attacker  has  complete  control  of  the  network;  she 
can delay, replay and delete messages. She can even masquerade as legitimate entities, until those
entities  prove  who  they  are  via  cryptographic  means.  This  model  of  the  network  and  attacker  is 
called  the  Dolev-Yao  model. 

Our  first  simple  protocol  is  given  by  the  following  message  flows,  and  is  illustrated  in  Figure  18.3, 

$A \to S : A, B$,

$S \to A : K_{ab}$,

$A \to B : K_{ab}, A$.

Figure 18.3. Protocol Version 1

In this protocol the trusted server S generates the key, after it is told by Alice to generate a key
for use between Alice and Bob. At the end of the protocol Bob is told by Alice that the key Kab
is one he should use for communication with Alice. However, it is immediately obvious that this
protocol is not secure: any eavesdropper can learn the secret key Kab, since it is transmitted
unencrypted.

So our first modification is to create a protocol which enables the key to remain secret; however,
we need to do this utilizing only the long-term static secret keys Kas and Kbs. With this modification
our protocol is formed of the following message flows, and is illustrated in Figure 18.4,


$A \to S : A, B$,

$S \to A : \{K_{ab}\}_{K_{bs}}, \{K_{ab}\}_{K_{as}}$,

$A \to B : \{K_{ab}\}_{K_{bs}}, A$.


Figure 18.4. Protocol Version 2

This is slightly better, but now we notice that the attacker can take the last message from Alice
to Bob, and replace it with $\{K_{ab}\}_{K_{bs}}, D$. This means that Bob will think that the key Kab is for
communicating with Diana and not Alice. Thus we do not have the property that Bob knows to
whom he is sending information. Imagine that Bob is in love with Diana and so encrypts a message
under Kab saying "I love you". This message can only be decrypted by Alice, so the adversary now
redirects the ciphertext to Alice. Alice decrypts the message and apparently finds out that Bob is
in love with her. We easily see that this could cause problems for Alice, Bob and Diana.

A  more  fundamental  problem  with  our  second  protocol  attempt  is  that  there  is  a  man-in-the- 
middle  attack.  In  the  following  message  flows,  the  attacker  Eve  (E),  masquerades  as  both  the  TTP 
and  Bob  to  Alice,  and  so  is  able  to  learn  the  key  that  Alice  thinks  she  is  using  to  communicate 
solely  with  Bob. 


$A \to E : A, B$,

$E \to S : A, E$,

$S \to E : \{K_{ae}\}_{K_{es}}, \{K_{ae}\}_{K_{as}}$,

$E \to A : \{K_{ae}\}_{K_{es}}, \{K_{ae}\}_{K_{as}}$,

$A \to E : \{K_{ae}\}_{K_{es}}, A$.
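
The man-in-the-middle attack on this second protocol attempt can be replayed step by step in a symbolic Python sketch. As an assumption of the sketch, encryption is modelled by a `Box` that merely records which key sealed it; the helper names `seal`, `unseal` and `ttp` are illustrative and not part of the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Symbolic ciphertext {payload}_key; no real cryptography is used."""
    key: str
    payload: tuple

def seal(key, *payload):
    return Box(key, payload)

def unseal(key, box):
    if box.key != key:
        raise ValueError("wrong key")
    return box.payload

def ttp(requester, peer, keys, fresh_key):
    """Version 2 server: returns {K}_(peer's key), {K}_(requester's key)."""
    return seal(keys[peer], fresh_key), seal(keys[requester], fresh_key)

keys = {"A": "Kas", "B": "Kbs", "E": "Kes"}

request = ("A", "B")      # 1. Alice asks for a key to talk to Bob
tampered = ("A", "E")     # 2. Eve rewrites the unprotected request
for_peer, for_alice = ttp(*tampered, keys, "Kae")   # 3. S answers honestly
(alice_key,) = unseal(keys["A"], for_alice)         # 4. Alice accepts the key
(eve_key,) = unseal(keys["E"], for_peer)            # 5. Eve opens the "Bob" part
```

Nothing in Alice's part of the reply names the intended partner, so she has no way to notice that `alice_key` is shared with Eve rather than Bob.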

To  get  around  these  two  problems  we  get  the  TTP  S  to  encrypt  the  identity  components,  thus 
preventing  both  of  the  above  problems.  Thus  our  third  attempt  at  a  protocol  is  given  by  Figure  18.5 
and  the  following  message  flows: 

$A \to S : A, B$,

$S \to A : \{K_{ab}, A\}_{K_{bs}}, \{K_{ab}, B\}_{K_{as}}$,

$A \to B : \{K_{ab}, A\}_{K_{bs}}$.

Figure 18.5. Protocol Version 3

This protocol however suffers from a replay attack. Assume the attacker can determine the key
Kab used in an old run of the protocol; this might be because users and systems are often not as
careful with respect to keys meant for short-term use (after all they are for short-term use!). She
can then masquerade as the TTP and deliver the same responses to a new run of a protocol as to
the old run, and she can do this without needing to know either Kas or Kbs.

The  basic  problem  is  that  Alice  does  not  know  whether  the  key  she  receives  is  fresh.  In  the 
Wide-Mouth  Frog  protocol  we  guaranteed  freshness  by  the  use  of  synchronized  clocks;  we  are 
trying  to  avoid  them  in  the  design  of  this  protocol.  We  avoid  the  need  to  use  synchronized  clocks 
by  instead  utilizing  nonces.  This  leads  us  to  our  fourth  protocol  attempt,  given  by  Figure  18.6  and 
the  following  message  flows: 

$A \to S : A, B, N_a$,

$S \to A : \{N_a, B, K_{ab}, \{K_{ab}, A\}_{K_{bs}}\}_{K_{as}}$,

$A \to B : \{K_{ab}, A\}_{K_{bs}}$.


By checking that the nonce Na received in the response from S is the same as that sent in the initial
message, Alice knows that the key Kab will be fresh, assuming that S is following the protocol. In
this  fourth  variant  Bob  knows  that  Alice  once  was  alive  and  sent  the  same  ephemeral  symmetric 
key  Kab  as  he  just  received.  But  this  could  have  been  a  long  while  in  the  past.  Similarly,  Alice 
knows  that  only  Bob  could  have  access  to  Kab,  but  she  does  not  know  whether  he  does  or  not. 
Thus  our  last  modification  is  to  add  a  key  confirmation  step,  which  assures  both  Alice  and  Bob 
that  they  are  both  alive  and  are  able  to  use  the  key.  This  is  achieved  by  adding  an  encrypted  nonce 
sent  by  Bob,  which  Alice  decrypts,  modifies  and  sends  back  to  Bob  encrypted.  With  this  change 
we  get  the  Needham-Schroeder  protocol,  see  Figure  18.7,  and  the  message  flows: 


$A \to S : A, B, N_a$,

$S \to A : \{N_a, B, K_{ab}, \{K_{ab}, A\}_{K_{bs}}\}_{K_{as}}$,

$A \to B : \{K_{ab}, A\}_{K_{bs}}$,

$B \to A : \{N_b\}_{K_{ab}}$,

$A \to B : \{N_b - 1\}_{K_{ab}}$.

Figure 18.6. Protocol Version 4

We now look again at each message in detail, and summarize what it achieves:

•  The  first  message  tells  S  that  Alice  wants  a  key  to  communicate  with  Bob. 

•  In  the  second  message  S  generates  the  session  key  Kab  and  sends  it  back  to  Alice.  The 
nonce  Na  is  included  so  that  Alice  knows  this  was  sent  after  her  request  of  the  first  message. 
The  session  key  is  also  encrypted  under  the  key  Kbs  for  sending  to  Bob. 

•  The  third  message  conveys  the  session  key  to  Bob. 

•  Bob  needs  to  check  that  the  third  message  was  not  a  replay.  So  he  needs  to  know  whether 
Alice  is  still  alive;  hence,  in  the  fourth  message  he  encrypts  a  nonce  back  to  Alice. 

•  In  the  final  message,  to  prove  to  Bob  that  she  is  still  alive,  Alice  encrypts  a  simple  function 
of  Bob’s  nonce  back  to  Bob. 


Figure 18.7. Needham-Schroeder protocol


The  main  problem  with  the  Needham-Schroeder  protocol  is  that  Bob  does  not  know  that  the  key 
he  shares  with  Alice  is  fresh,  a  fact  which  was  not  spotted  until  some  time  after  the  original  protocol 
was  published.  An  adversary  who  finds  an  old  session  transcript  can,  after  finding  the  old  session 
key  by  some  other  means,  use  the  old  session  transcript  in  the  last  three  messages  involving  Bob. 
Hence,  the  adversary  can  get  Bob  to  agree  to  a  key  with  the  adversary,  which  Bob  thinks  he  is 
sharing with Alice.

Note that Alice and Bob have their secret session key generated by the TTP and so neither party
needs  to  trust  the  other  to  produce  “good”  keys.  They  of  course  trust  the  TTP  to  generate  good 
keys  since  the  TTP  is  an  authority  trusted  by  everyone.  In  some  applications  this  last  assumption 
is  not  valid  and  more  involved  algorithms,  or  public  key  algorithms,  are  required. 
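
The freshness problem of the Needham-Schroeder protocol described above, where an old third message is replayed to Bob, can be illustrated with a symbolic Python sketch. The `Box` model of encryption, the helper names and the concrete nonce value are assumptions made for the illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Symbolic ciphertext {payload}_key; no real cryptography is used."""
    key: str
    payload: tuple

def seal(key, *payload):
    return Box(key, payload)

def unseal(key, box):
    if box.key != key:
        raise ValueError("wrong key")
    return box.payload

def bob_accept(ticket, k_bs):
    """Bob's view of message 3: it carries no nonce or timestamp of his own."""
    k_ab, claimed_peer = unseal(k_bs, ticket)
    return k_ab, claimed_peer

# An old run: S once issued Kab_old, and this ticket was sent to Bob.
old_ticket = seal("Kbs", "Kab_old", "A")

# Later the adversary, who has learned Kab_old, replays the old ticket.
k, peer = bob_accept(old_ticket, "Kbs")

# Messages 4 and 5: Bob's challenge and the adversary's valid response.
challenge = seal(k, 41)                   # {Nb}_Kab with Nb = 41
(n_b,) = unseal("Kab_old", challenge)     # adversary decrypts with the old key
response = seal("Kab_old", n_b - 1)       # {Nb - 1}_Kab
bob_convinced = unseal(k, response) == (n_b - 1,)
```

Bob ends the run believing he shares a key with Alice, although only the adversary took part; the key confirmation step proves knowledge of the key, not its freshness.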

18.3.3.  Kerberos:  We  end  this  section  by  looking  at  Kerberos.  Kerberos  is  an  authentication 
system  based  on  symmetric  encryption,  with  keys  shared  with  an  authentication  server;  it  is  based 
on  ideas  underlying  the  Needham-Schroeder  protocol.  Kerberos  was  developed  at  MIT  around 
1987  as  part  of  Project  Athena.  A  modified  version  of  this  original  version  of  Kerberos  is  now  used 
in  many  versions  of  the  Windows  operating  system,  and  in  many  other  systems. 

The  network  is  assumed  to  consist  of  clients  and  a  server,  where  the  clients  may  be  users, 
programs  or  services.  Kerberos  keeps  a  central  database  of  clients  including  a  secret  key  for  each 
client, hence Kerberos requires a key space of size O(n) if we have n clients. Kerberos is used to
provide  authentication  of  one  entity  to  another  and  to  issue  session  keys  to  these  entities. 

In addition Kerberos can run a ticket-granting system to enable access control to services and
resources.  This  division  mirrors  what  happens  in  real  companies.  For  example,  in  a  company  the 
personnel  department  administers  who  you  are,  whilst  the  computer  department  administers  what 
resources  you  can  use.  This  division  is  also  echoed  in  Kerberos  with  an  authentication  server  and 
a  ticket  generation  server  TGS.  The  TGS  grants  tickets  to  enable  users  to  access  resources,  such 
as files, printers, etc.


Suppose  A  wishes  to  access  a  resource  B.  First  A  logs  in  to  the  authentication  server  using  a 
password.  The  user  A  is  given  a  ticket  from  this  server  encrypted  under  her  password.  This  ticket 
contains  a  session  key  Kas.  She  now  uses  Kas  to  obtain  a  ticket  from  the  TGS  S  to  access  the 
resource B. The output of the TGS is a key Kab, a timestamp Ts and a lifetime L. The output
of the TGS is used to authenticate A in subsequent traffic with B. The flows look something like
those given in Figure 18.8,

$A \to S : A, B$,

$S \to A : \{T_s, L, K_{ab}, B, \{T_s, L, K_{ab}, A\}_{K_{bs}}\}_{K_{as}}$,

$A \to B : \{T_s, L, K_{ab}, A\}_{K_{bs}}, \{A, T_a\}_{K_{ab}}$,

$B \to A : \{T_a + 1\}_{K_{ab}}$.

Again we describe what each message flow is trying to achieve:

• The first message is A telling S that she wants to access B.

• If S allows this access then a ticket {Ts, L, Kab, A} is created. This is encrypted under
Kbs  and  sent  to  A  for  forwarding  to  B.  The  user  A  also  gets  a  copy  of  the  key  in  a  form 
readable  by  her. 

•  The  user  A  wants  to  verify  that  the  ticket  is  valid  and  that  the  resource  B  is  alive.  Hence, 
she  sends  an  encrypted  nonce/timestamp  Ta  to  B. 

•  The  resource  B  sends  back  the  encryption  of  Ta  +  1,  after  checking  that  the  timestamp 
Ta  is  recent,  thus  proving  that  he  knows  the  key  and  is  alive. 

Thus  we  have  removed  the  problems  associated  with  the  Needham-Schroeder  protocol  by  using 
timestamps,  but  this  has  created  a  requirement  for  synchronized  clocks. 
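
A symbolic Python sketch of the four Kerberos-style flows above may help to show where the timestamp and lifetime are checked. It rests on several assumptions that are not in the text: encryption is modelled by a `Box` that records its key, times are integers, the ticket is taken to be valid in the window from Ts to Ts + L, and `max_skew` is a hypothetical tolerance for the authenticator timestamp.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Symbolic ciphertext {payload}_key; no real cryptography is used."""
    key: str
    payload: tuple

def seal(key, *payload):
    return Box(key, payload)

def unseal(key, box):
    if box.key != key:
        raise ValueError("wrong key")
    return box.payload

def tgs_issue(a, b, now, lifetime, keys, k_as, k_ab):
    """S -> A : {Ts, L, Kab, B, {Ts, L, Kab, A}_Kbs}_Kas."""
    ticket = seal(keys[b], now, lifetime, k_ab, a)
    return seal(k_as, now, lifetime, k_ab, b, ticket)

def resource_check(ticket, authenticator, k_bs, now, max_skew=5):
    """B validates the ticket and {A, Ta}_Kab, then answers {Ta + 1}_Kab."""
    t_s, lifetime, k_ab, client = unseal(k_bs, ticket)
    if not (t_s <= now <= t_s + lifetime):      # assumed validity window
        raise ValueError("ticket not valid at this time")
    name, t_a = unseal(k_ab, authenticator)
    if name != client or abs(now - t_a) > max_skew:
        raise ValueError("bad authenticator")
    return seal(k_ab, t_a + 1)

keys = {"B": "Kbs"}
reply = tgs_issue("A", "B", now=1000, lifetime=300, keys=keys,
                  k_as="Kas", k_ab="Kab")
t_s, lifetime, k_ab, b, ticket = unseal("Kas", reply)   # Alice reads her part
auth = seal(k_ab, "A", 1010)                            # {A, Ta}_Kab
confirmation = resource_check(ticket, auth, "Kbs", now=1011)
```

Presenting the same ticket long after its lifetime has elapsed fails the validity check, which is how the timestamp replaces the missing freshness guarantee for B.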

18.4.  Fresh  Ephemeral  Symmetric  Keys  from  Static  Public  Keys 

Recall  that  the  main  drawback  with  the  use  of  fast  bulk  encryption  based  on  block  or  stream  ciphers 
was  the  problem  of  key  distribution.  We  have  already  seen  a  number  of  techniques  to  solve  this 
problem,  using  protocols  which  are  themselves  based  on  symmetric  key  techniques.  These,  however, 
also  have  problems  associated  with  them.  For  example,  the  symmetric  key  protocols  required  the 
use  of  already  deployed  long-term  keys  between  each  user  and  a  trusted  central  authority.  In  this 
section we look at two public-key-based techniques. The first, called key transport, uses public
key encryption to transmit a symmetric key from one user to the other; the second, called key
agreement, is a protocol which as output produces a symmetric key, and which uses public key
signatures  to  authenticate  the  parties. 

18.4.1. Key Transport: Let $(e_{\mathfrak{pk}}, d_{\mathfrak{sk}})$ be a public key encryption scheme, with public/private key
pair $(\mathfrak{pk}_B, \mathfrak{sk}_B)$ associated with a user Bob via a certificate. Suppose Alice wants to send a symmetric
key over to Bob: she first looks up Bob's public key in a directory, generates the required
symmetric key Kab from the required space, then encrypts this and sends it to Bob. This is a
very simple scheme whose flow is given pictorially in Figure 18.9 and in symbols by

$A \to B : e_{\mathfrak{pk}_B}(K_{ab})$.


Figure 18.9. Public-key-based key transport: Version 1 (Alice sends $e_{\mathfrak{pk}_B}(K_{ab})$ to Bob)

This  protocol  is  very  much  like  the  first  part  of  a  hybrid  encryption  scheme.  However,  it  does 
not  achieve  all  that  we  require.  Firstly,  whilst  Alice  knows  that  only  Bob  can  decrypt  the  ciphertext 
to obtain the new key Kab, Bob does not know that the ciphertext came from Alice. So Bob does
not know to whom the key Kab should be associated. One way around this is for Alice to also
append a digital signature to the ciphertext. So now we have two public/private key pairs: a
signature pair for Alice $(\mathfrak{pk}_A, \mathfrak{sk}_A)$ for some public key signature algorithm Sig, and an encryption
pair for Bob $(\mathfrak{pk}_B, \mathfrak{sk}_B)$ for some encryption algorithm $e_{\mathfrak{pk}_B}$. The resulting protocol flow is given in

Figure  18.10.  However,  this  protocol  suffers  from  another  weakness;  we  can  get  Bob  to  think  he  is 
sharing  a  key  with  Eve,  when  he  is  actually  sharing  a  key  with  Alice.  The  attack  goes  as  follows: 

$A \to E : c := e_{\mathfrak{pk}_B}(K_{ab}), \ \mathrm{Sig}_{\mathfrak{sk}_A}(c)$,

$E \to B : c := e_{\mathfrak{pk}_B}(K_{ab}), \ \mathrm{Sig}_{\mathfrak{sk}_E}(c)$.

Figure 18.10. Public-key-based key transport: Version 2 (Alice sends $c := e_{\mathfrak{pk}_B}(K_{ab}), \ \mathrm{Sig}_{\mathfrak{sk}_A}(c)$ to Bob)


This  works  because  the  identity  of  the  sender  is  not  bound  to  the  ciphertext.  Following  the  lead 
from  our  examples  in  Section  18.3,  the  simple  solution  to  this  problem  is  to  encrypt  the  sender’s 
identity  along  with  the  transmitted  key,  as  in  Figure  18.11.  But  even  this  cannot  be  considered 
secure  due  to  a  replay  attack  which  we  will  outline  later. 


Figure 18.11. Public-key-based key transport: Version 3 (Alice sends $c := e_{\mathfrak{pk}_B}(A \| K_{ab}), \ \mathrm{Sig}_{\mathfrak{sk}_A}(c)$ to Bob)
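
The difference between the second and third versions of key transport can be sketched symbolically in Python. The classes `Ciphertext` and `Signature` below are illustrative stand-ins for public key encryption and signatures (an assumption of this sketch, with no real cryptography), and the function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ciphertext:
    """Symbolic e_pk(payload): only `recipient` can read `payload`."""
    recipient: str
    payload: tuple

@dataclass(frozen=True)
class Signature:
    """Symbolic Sig_sk(message) produced by `signer`."""
    signer: str
    message: Ciphertext

def accept_v2(c, sig):
    """Version 2: Bob attributes the key to whoever signed c."""
    if sig.message != c:
        raise ValueError("signature does not match ciphertext")
    (key,) = c.payload
    return sig.signer, key

def accept_v3(c, sig):
    """Version 3: the name encrypted inside c must match the signer."""
    if sig.message != c:
        raise ValueError("signature does not match ciphertext")
    sender, key = c.payload
    if sender != sig.signer:
        raise ValueError("sender identity does not match signer")
    return sender, key

# Version 2: Eve strips Alice's signature and signs the same ciphertext.
c2 = Ciphertext("B", ("Kab",))
stolen = accept_v2(c2, Signature("E", c2))

# Version 3: the honest run succeeds; Eve's re-signing would be rejected.
c3 = Ciphertext("B", ("A", "Kab"))
honest = accept_v3(c3, Signature("A", c3))
```

In the second version Bob associates Alice's key with Eve, whereas in the third version the encrypted identity no longer matches Eve's signature.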


Forward Secrecy: None of the systems based on symmetric key encryption given in Section 18.3,
nor the previous method of key transport based on public key encryption, is
forward secure. A system is said to have forward secrecy if the compromise of a long-term
private key, i.e. $\mathfrak{sk}_B$ in the above protocol, at some point in the future does not compromise the
security of communications made using that key in the past. Notice in Figure 18.11 that if the
recipient's private key $\mathfrak{sk}_B$ is compromised in the future, then all communications in the past are
also compromised.

In  addition  using  key  transport  implies  that  the  recipient  trusts  the  sender  to  be  able  to  generate, 
in  a  sensible  way,  the  session  key.  Sometimes  the  recipient  may  wish  to  contribute  some  randomness 
of  their  own  to  the  session  key.  However,  this  can  only  be  done  if  both  parties  are  online  at  the 
same  moment  in  time.  Key  transport  is  thus  more  suited  to  the  case  where  only  the  sender  is 
online,  as  in  applications  like  email,  for  example. 

The  idea  of  both  parties  contributing  to  the  entropy  of  the  session  key  not  only  aids  in  creating 
perfectly  secure  schemes,  it  also  avoids  the  replay  attack  we  have  on  our  key  transport  protocol. 
The next set of protocols, based on Diffie-Hellman key exchange, does indeed contribute entropy
from  both  sides. 

18.4.2. Diffie-Hellman Key Exchange: To avoid the fact that key transport based on public
key  encryption  is  not  forward  secure,  and  the  problem  of  one  party  generating  the  key,  the  modern 
way  of  using  public  key  techniques  to  create  symmetric  keys  between  two  parties  is  based  on  a 
process  called  key  exchange. 

Key exchange was introduced in the same seminal paper by Diffie and Hellman in which they
introduced public key cryptography. Their protocol for key distribution, called Diffie-Hellman key
exchange, allows two parties to agree a secret key over an insecure channel without having met
before. Its security is based on the discrete logarithm problem in a finite abelian group G of prime
order q. In the original paper the group is taken to be a subgroup G of $\mathbb{F}_p^*$, but now more efficient
versions  can  be  produced  by  taking  G  to  be  a  subgroup  of  an  elliptic  curve,  in  which  case  the 
protocol  is  called  EC-DH. 

In Diffie-Hellman the two parties each have their own ephemeral secrets a and b, which are
elements in the group Z/qZ. The basic message flows for the Diffie-Hellman protocol are given by
Figure 18.12 and the following notational representation:

$A \to B : \mathfrak{ep}_A = g^a$,

$B \to A : \mathfrak{ep}_B = g^b$.

Figure 18.12. Diffie-Hellman key exchange

From these exchanged ephemeral key values, $\mathfrak{ep}_A$ and $\mathfrak{ep}_B$, and their respective ephemeral secret
keys, a and b, both parties can agree on the same secret session key:

• Alice can compute $K \leftarrow \mathfrak{ep}_B^a = (g^b)^a$, since she knows a and was sent $\mathfrak{ep}_B = g^b$ by Bob,

• Bob can also compute $K \leftarrow \mathfrak{ep}_A^b = (g^a)^b$, since he knows b and was sent $\mathfrak{ep}_A = g^a$ by Alice.

Eve, the attacker, can see the messages $g^a$ and $g^b$ and then needs to recover the secret key $K = g^{a \cdot b}$,
which  is  exactly  the  Diffie-Hellman  problem  considered  in  Chapter  3.  Hence,  the  security  of  the 
above  protocol  rests  not  on  the  difficulty  of  solving  the  discrete  logarithm  problem,  DLP,  but  on 
the  difficulty  of  solving  the  Diffie-Hellman  problem,  DHP.  Recall  that  it  may  be  the  case  that  it  is 
easier  to  solve  the  DHP  than  the  DLP,  although  no  one  believes  this  to  be  true  for  the  groups  that 
are  currently  used  in  real-life  protocols. 

In  practice  one  does  not  want  the  agreed  key  to  be  an  element  of  a  group,  one  requires  it  to  be 
a  bit  string  of  a  given  length.  Hence,  in  real  systems  one  applies  a  key  derivation  function  to  the 
agreed  secret  group  element,  to  obtain  the  required  key  for  future  use.  It  turns  out  that  not  only 
is  this  operation  important  for  functional  reasons,  but  in  addition  when  we  model  the  KDF  as  a 
random  oracle  one  can  prove  the  security  of  Diffie-Hellman-related  protocols  in  the  random  oracle 
model.  Thus  we  almost  always  use  the  derived  key  k  =  H(K),  for  some  random  oracle  H. 

Notice  that  the  Diffie-Hellman  protocol  can  be  performed  both  online  (in  which  case  both 
parties  contribute  to  the  randomness  in  the  shared  session  key)  or  offline,  where  one  of  the  parties 
uses  a  long-term  key  of  the  form  ga  instead  of  an  ephemeral  key.  Hence,  the  Diffie-Hellman  protocol 
can  be  used  as  a  key  exchange  or  as  a  key  transport  protocol.  We  shall  focus  however  on  its  use  as 
a  key  exchange  protocol. 

Finite Field Example: The following is a very small example; we let the domain parameters be
given by

p = 2 147 483 659, q = 2 402 107, and g = 509 190 093,

but in real life one would take $p \approx 2^{2048}$. Note that g has prime order q in the field $\mathbb{F}_p$. The
following diagram indicates a possible message flow for the Diffie-Hellman protocol:

    Alice                                                  Bob
    a = 12 345                                             b = 654 323,
    $\mathfrak{ep}_A = g^a$ = 382 909 757        →         $\mathfrak{ep}_A$ = 382 909 757,
    $\mathfrak{ep}_B$ = 1 190 416 419            ←         $\mathfrak{ep}_B = g^b$ = 1 190 416 419.

The shared secret group element is then computed via

    $\mathfrak{ep}_A^b$ = 382 909 757^{654 323} (mod p) = 881 311 606,

    $\mathfrak{ep}_B^a$ = 1 190 416 419^{12 345} (mod p) = 881 311 606,

with the actual secret key being given by k = H(881 311 606) for some KDF H, which we model
as a random oracle.

Notice that group elements are transmitted in the protocol, hence when using a finite field
such as $\mathbb{F}_p^*$ for the Diffie-Hellman protocol the communication costs are around 2048 bits in each
direction, since it is prudent to choose $p \approx 2^{2048}$. However, when one uses an elliptic curve group
$E(\mathbb{F}_q)$ one can choose $q \approx 2^{256}$, and so the communication costs are much less, namely around 256
bits in each direction. In addition the group exponentiation step for elliptic curves can be done
more efficiently than that for finite prime fields.
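
The finite field example above can be replayed with a few lines of Python, using the built-in three-argument `pow` for modular exponentiation. The key derivation function here is an assumption of this sketch: SHA-256 applied to the decimal encoding of the group element stands in for the random oracle H, which the text does not specify.

```python
import hashlib

# Domain parameters and ephemeral secrets quoted in the example.
p = 2_147_483_659
g = 509_190_093
a, b = 12_345, 654_323

def kdf(group_element: int) -> bytes:
    """Illustrative stand-in for H: SHA-256 of the decimal encoding."""
    return hashlib.sha256(str(group_element).encode()).digest()

ep_a = pow(g, a, p)            # Alice sends g^a
ep_b = pow(g, b, p)            # Bob sends g^b
shared_alice = pow(ep_b, a, p) # Alice computes (g^b)^a
shared_bob = pow(ep_a, b, p)   # Bob computes (g^a)^b
k = kdf(shared_alice)          # derived session key

print(ep_a, ep_b, shared_alice)
```

The printed values can be compared with the numbers in the example; whatever they are, both parties necessarily arrive at the same group element, since $(g^b)^a = (g^a)^b$.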

Elliptic Curve Example: As a baby example of EC-DH consider the elliptic curve

$E : Y^2 = X^3 + X + 3$

over the field $\mathbb{F}_{199}$. Let the base point be given by G = (1, 76), which has prime order q =
$\#E(\mathbb{F}_{199})$ = 197. Then a possible EC-DH message flow is given by

    Alice                                               Bob
    a = 23                                              b = 86,
    $\mathfrak{ep}_A$ = [a]G = (2, 150)        →        $\mathfrak{ep}_A$ = (2, 150),
    $\mathfrak{ep}_B$ = (123, 187)             ←        $\mathfrak{ep}_B$ = [b]G = (123, 187).

The shared secret key is then computed via

    [b]$\mathfrak{ep}_A$ = [86](2, 150) = (156, 75),

    [a]$\mathfrak{ep}_B$ = [23](123, 187) = (156, 75).

The shared key is then usually taken to be the x-coordinate 156 of the computed point. In addition,
instead of transmitting the points, we transmit the compression of the point, which results in a
significant saving in bandwidth.
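
A minimal Python sketch of this toy EC-DH exchange is given below. It assumes the curve $Y^2 = X^3 + X + 3$ over $\mathbb{F}_{199}$; the constant term +3 is used because the points quoted in the example, such as (1, 76) and (2, 150), satisfy that equation modulo 199. The function names are illustrative, `None` models the point at infinity, and the modular inverse via `pow(x, -1, P)` needs Python 3.8 or later.

```python
P = 199                  # size of the prime field
A_COEF, B_COEF = 1, 3    # curve Y^2 = X^3 + A_COEF*X + B_COEF
G = (1, 76)              # base point from the example

def on_curve(pt):
    """Check the curve equation; None is the point at infinity."""
    if pt is None:
        return True
    x, y = pt
    return (y * y - (x ** 3 + A_COEF * x + B_COEF)) % P == 0

def add(p1, p2):
    """Affine point addition, including doubling and inverses."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                      # p2 is the negative of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A_COEF) * pow(2 * y1 % P, -1, P)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P)
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)

def mul(k, pt):
    """Scalar multiplication [k]pt by double-and-add."""
    acc = None
    while k > 0:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

a, b = 23, 86
ep_a, ep_b = mul(a, G), mul(b, G)
shared_alice, shared_bob = mul(a, ep_b), mul(b, ep_a)
print(ep_a, ep_b, shared_alice)
```

The printed points can be compared with those in the example. In a real system one would of course use a standardized curve of about 256 bits and a vetted library rather than hand-written arithmetic.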

18.4.3. Signed Diffie-Hellman: So we seem to have solved the key distribution problem. But
there  is  an  important  problem:  you  need  to  be  careful  who  you  are  agreeing  a  key  with.  Alice  has 
no  assurance  that  she  is  agreeing  a  key  with  Bob,  which  can  lead  to  the  following  man-in-the-middle 
attack: 


    Alice                        Eve                          Bob
    a  →  $g^a$          →       $g^a$,
    $g^m$                ←       $g^m$  ←  m,
    $g^{a \cdot m}$              $g^{a \cdot m}$,
                                 n  →  $g^n$          →       $g^n$,
                                 $g^b$                ←       $g^b$  ←  b,
                                 $g^{b \cdot n}$              $g^{b \cdot n}$.

In  the  man-in-the-middle  attack 

• Alice agrees a key with Eve, thinking it is Bob with whom she is agreeing a key,

•  Bob  agrees  a  key  with  Eve,  thinking  it  is  Alice, 

•  Eve  can  now  examine  communications  as  they  pass  through  her  i.e.  she  acts  as  a  router. 
She  does  not  alter  the  plaintext,  so  her  actions  go  undetected. 
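
The man-in-the-middle attack can be checked numerically with a short Python sketch. It reuses the small finite field parameters and the secrets a and b from the example in Section 18.4.2; Eve's secrets m and n are hypothetical values chosen only for illustration.

```python
p = 2_147_483_659
g = 509_190_093

a, b = 12_345, 654_323     # Alice's and Bob's ephemeral secrets
m, n = 1_111, 2_222        # Eve's secrets (hypothetical values)

from_alice = pow(g, a, p)  # g^a, intercepted by Eve
to_alice = pow(g, m, p)    # g^m, which Alice takes to be Bob's message
to_bob = pow(g, n, p)      # g^n, which Bob takes to be Alice's message
from_bob = pow(g, b, p)    # g^b, intercepted by Eve

key_alice = pow(to_alice, a, p)        # Alice computes g^(a*m)
key_eve_alice = pow(from_alice, m, p)  # Eve computes the same value
key_bob = pow(to_bob, b, p)            # Bob computes g^(b*n)
key_eve_bob = pow(from_bob, n, p)      # Eve computes the same value
```

Eve ends up sharing one key with Alice and another with Bob, so she can decrypt traffic from one side and re-encrypt it for the other without either party noticing.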

So  we  can  conclude  that  the  Diffie-Hellman  protocol  on  its  own  is  not  enough.  For  example  how 
does  Alice  know  with  whom  she  is  agreeing  a  key?  Is  it  Bob  or  Eve?  One  way  around  the  man-in- 
the-middle  attack  on  the  Diffie-Hellman  protocol  is  for  Alice  to  sign  her  message  to  Bob  and  Bob 
to  sign  his  message  to  Alice.  In  this  way  both  parties  know  who  they  are  talking  to.  This  produces 
the protocol called signed-Diffie-Hellman given in Figure 18.13 and with message flows:

$A \to B : \mathfrak{ep}_A = g^a, \ \mathrm{Sig}_{\mathfrak{sk}_A}(\mathfrak{ep}_A)$,

$B \to A : \mathfrak{ep}_B = g^b, \ \mathrm{Sig}_{\mathfrak{sk}_B}(\mathfrak{ep}_B)$.

Figure 18.13. Signed Diffie-Hellman key exchange

The  problem  is  that  we  again  have  the  attack  of  stripping  the  signature  from  Alice’s  message, 
replacing  it  with  Eve’s;  then  Bob  will  think  he  shares  a  key  with  Eve,  whereas  actually  he  shares  a 
key  with  Alice.  In  Figure  18.11  this  was  solved  by  encrypting  the  identity  of  the  sender,  so  that  it 
could not be tampered with. However, in Diffie-Hellman key exchange there is no encryption used
into  which  we  can  embed  the  identity. 

18.4.4.  Station-to-Station  Protocol:  To  get  around  this  latter  problem  the  following  protocol 
was  invented,  called  the  station-to-station  (STS)  protocol.  The  original  presentation  dates  back  to 
1987. The basic idea is that the two parties encrypt their signatures using some symmetric key
encryption algorithm $(e_k, d_k)$, and the key, $k \leftarrow H(g^{a \cdot b})$, derived from the key exchange protocol.
In particular this means that the initiator's, in our case Alice's, signature needs to be sent in a
third message flow. Thus we have the flows:

$A \to B : \mathfrak{ep}_A = g^a$,

$B \to A : \mathfrak{ep}_B = g^b, \ e_k(\mathrm{Sig}_{\mathfrak{sk}_B}(\mathfrak{ep}_B, \mathfrak{ep}_A))$ where $k \leftarrow H(\mathfrak{ep}_A^b)$,

$A \to B : e_k(\mathrm{Sig}_{\mathfrak{sk}_A}(\mathfrak{ep}_A, \mathfrak{ep}_B))$ where $k \leftarrow H(\mathfrak{ep}_B^a)$.


Notice that the messages signed by each party have the group elements in different orders. Another
variant on the STS protocol is for the parties to authenticate their signatures using a MAC function
$(\mathrm{Mac}_{k'}, \mathrm{Verify}_{k'})$. In this variant the key derivation function is used to derive a key k to use in the
following protocol (for which we require the key agreement scheme), and a separate key k' to perform
the authentication of the signature values. The reason for this is to provide a clear separation
between k and the protocol used to derive it. So in this variant of the STS protocol the message
flows become:

$A \to B : \mathfrak{ep}_A = g^a$,

$B \to A : \mathfrak{ep}_B = g^b, \ S_B := \mathrm{Sig}_{\mathfrak{sk}_B}(\mathfrak{ep}_B, \mathfrak{ep}_A), \ \mathrm{Mac}_{k'}(S_B)$ where $k \| k' \leftarrow H(\mathfrak{ep}_A^b)$,

$A \to B : S_A := \mathrm{Sig}_{\mathfrak{sk}_A}(\mathfrak{ep}_A, \mathfrak{ep}_B), \ \mathrm{Mac}_{k'}(S_A)$ where $k \| k' \leftarrow H(\mathfrak{ep}_B^a)$.


18.4.5. Blake-Wilson-Menezes Protocol: One can ask whether one can obtain authentication
without  the  need  for  signatures  and/or  MACs  as  in  the  station-to-station  protocol,  and  without 
the  need  for  additional  data  to  be  sent,  or  the  additional  third  message  flow.  The  answer  to  all  of 
these questions is yes, as the following protocol, due to Blake-Wilson and Menezes, and the MQV
protocol  of  the  next  section  will  show. 

Originally the Diffie-Hellman protocol was presented as a way to provide authentic keys given
shared static public keys $\mathfrak{pk}_A = g^a$ and $\mathfrak{pk}_B = g^b$. In other words these values were not exchanged,
but used to produce a static authenticated shared symmetric key. Then people realized that by exchanging
ephemeral versions of such keys one could obtain new symmetric keys for each iteration. It turns
out that we can combine both the static and the ephemeral variants of Diffie-Hellman key exchange to
obtain an authenticated key agreement protocol, without the need for digital signatures.

To do this we assume that Alice has a long-term static public/private key pair given by $(\mathfrak{pk}_A = g^a, a)$,
and Bob has a similar long-term public/private key pair given by $(\mathfrak{pk}_B = g^b, b)$. We first
assume that Alice has obtained an authentic version of Bob's public key $\mathfrak{pk}_B$, say via a digital
certificate, and vice versa. We can then obtain an authenticated key agreement protocol using only
two message flows, as follows:

$$A \to B : \mathfrak{ek}_A = g^x,$$

$$B \to A : \mathfrak{ek}_B = g^y.$$

Notice that the message flows are identical to the original Diffie-Hellman protocol. The key
difference is in how the shared secret key k is derived. Alice derives it via the equation

$$k \leftarrow H(\mathfrak{pk}_B^{\,x}, \mathfrak{ek}_B^{\,a}) = H(g^{bx}, g^{ya}).$$

The same key is derived by Bob using the equation

$$k \leftarrow H(\mathfrak{ek}_A^{\,b}, \mathfrak{pk}_A^{\,y}) = H(g^{xb}, g^{ay}).$$
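The two key derivations above can be checked numerically. The following is a minimal sketch only: the toy group (p = 2039, q = 1019, g = 4), the use of SHA-256 for H, and the function names `keygen`, `alice_key` and `bob_key` are all assumptions made for illustration and are far too small or simplistic for real use.

```python
import hashlib
import secrets

# Toy parameters (assumed for illustration): p = 2q + 1 with q prime, and
# g = 4 generates the subgroup of order q. Not secure at this size.
P, Q, G = 2039, 1019, 4

def keygen():
    """Return (public, private) with public = g^private mod p."""
    priv = secrets.randbelow(Q - 1) + 1
    return pow(G, priv, P), priv

def H(u, v):
    """Hash two group elements into a key (SHA-256 stands in for H)."""
    data = u.to_bytes(2, "big") + v.to_bytes(2, "big")
    return hashlib.sha256(data).digest()

def alice_key(pk_B, ek_B, a, x):
    # k <- H(pk_B^x, ek_B^a) = H(g^{bx}, g^{ya})
    return H(pow(pk_B, x, P), pow(ek_B, a, P))

def bob_key(pk_A, ek_A, b, y):
    # k <- H(ek_A^b, pk_A^y) = H(g^{xb}, g^{ay})
    return H(pow(ek_A, b, P), pow(pk_A, y, P))

# Long-term (static) keys and one run of the two ephemeral message flows.
pk_A, a = keygen()
pk_B, b = keygen()
ek_A, x = keygen()   # sent A -> B
ek_B, y = keygen()   # sent B -> A

k_alice = alice_key(pk_B, ek_B, a, x)
k_bob = bob_key(pk_A, ek_A, b, y)
```

Each party mixes its own static secret with the other's ephemeral value and its own ephemeral secret with the other's static value, which is why both sides hash the same pair of group elements.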


18.4.6. MQV Protocol: The only problem with the previous protocol is that one must perform
three exponentiations per key agreement per party. Alice needs to compute $\mathfrak{ek}_A = g^x$, $\mathfrak{pk}_B^{\,x}$ and
$\mathfrak{ek}_B^{\,a}$. This led Menezes, Qu and Vanstone to invent the following protocol, called the MQV protocol.
Once again, being based on the Diffie-Hellman protocol, security is based on the discrete logarithm
problem in a group G generated by g. Like the Blake-Wilson-Menezes protocol, MQV works by
assuming that both parties, Alice and Bob, first generate long-term public/private key pairs which
we shall denote by $(\mathfrak{pk}_A = g^a, a)$ and $(\mathfrak{pk}_B = g^b, b)$. Again, we shall assume that Bob knows that
$\mathfrak{pk}_A$ is the authentic public key belonging to Alice and that Alice knows that $\mathfrak{pk}_B$ is the authentic
public key belonging to Bob.

Assume Alice and Bob now want to agree on a secret session key; they execute the same message
flows as the Blake-Wilson-Menezes protocol, namely:

$$A \to B : \mathfrak{ek}_A = g^x,$$

$$B \to A : \mathfrak{ek}_B = g^y.$$

So this does not look much different to the standard un-authenticated Diffie-Hellman protocol or
the Blake-Wilson-Menezes protocol. However, the key difference is in how the final session key is
created from the relevant values.

Assume you are Alice, so you know

$$\mathfrak{pk}_A,\ \mathfrak{pk}_B,\ \mathfrak{ek}_A,\ \mathfrak{ek}_B,\ a \text{ and } x.$$

Let l denote half the bit size of the order of the group G, for example if we are using a group with
order $q \approx 2^{256}$ then we set l = 256/2 = 128. To determine the session key, Alice now computes

(1) Convert $\mathfrak{ek}_A$ to an integer i.

(2) $s_A \leftarrow (i \bmod 2^l) + 2^l$.

(3) Convert $\mathfrak{ek}_B$ to an integer j.

(4) $t_A \leftarrow (j \bmod 2^l) + 2^l$.

(5) $h_A \leftarrow x + s_A \cdot a \pmod{q}$.

(6) $K_A \leftarrow (\mathfrak{ek}_B \cdot \mathfrak{pk}_B^{\,t_A})^{h_A}$.

Notice how $s_A$ and $t_A$ are exactly l + 1 bits in length, with the top bit always set, and, assuming
the conversion of a group element to an integer results in a "random-looking" integer, their lower
l bits will also behave like random l-bit integers. Thus Alice needs to compute one full
exponentiation, to produce $\mathfrak{ek}_A$, and one multi-exponentiation to produce $K_A$.

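Bob runs exactly the same calculation but with the roles of the public and private keys swapped
around in the obvious manner, namely:

(1) Convert $\mathfrak{ek}_B$ to an integer i.

(2) $s_B \leftarrow (i \bmod 2^l) + 2^l$.

(3) Convert $\mathfrak{ek}_A$ to an integer j.

(4) $t_B \leftarrow (j \bmod 2^l) + 2^l$.

(5) $h_B \leftarrow y + s_B \cdot b \pmod{q}$.

(6) $K_B \leftarrow (\mathfrak{ek}_A \cdot \mathfrak{pk}_A^{\,t_B})^{h_B}$.

Then $K_A = K_B$ is the shared secret. To see why the $K_A$ computed by Alice and the $K_B$ computed
by Bob are the same we notice that the $s_A$ and $t_A$ seen by Alice are swapped when seen by Bob,
i.e. $s_A = t_B$ and $s_B = t_A$. We see that

$$\begin{aligned}
\mathrm{dlog}_g(K_A) &= \mathrm{dlog}_g\big((\mathfrak{ek}_B \cdot \mathfrak{pk}_B^{\,t_A})^{h_A}\big) = (y + b \cdot t_A) \cdot h_A \\
&= y \cdot (x + s_A \cdot a) + b \cdot t_A \cdot (x + s_A \cdot a) = y \cdot (x + t_B \cdot a) + b \cdot s_B \cdot (x + t_B \cdot a) \\
&= x \cdot (y + s_B \cdot b) + a \cdot t_B \cdot (y + s_B \cdot b) = (x + a \cdot t_B) \cdot h_B \\
&= \mathrm{dlog}_g\big((\mathfrak{ek}_A \cdot \mathfrak{pk}_A^{\,t_B})^{h_B}\big) = \mathrm{dlog}_g(K_B).
\end{aligned}$$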
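The six steps and the equality $K_A = K_B$ can be exercised in a toy group. This is a sketch under stated assumptions: the group parameters (p = 2039, q = 1019, g = 4) and the names `truncate` and `mqv_key` are hypothetical, group elements are already integers modulo p so the "convert to an integer" step is the identity, and a real implementation would use a multi-exponentiation rather than two separate calls to `pow`.

```python
import secrets

# Toy group (assumed for illustration): g = 4 has prime order q modulo p.
P, Q, G = 2039, 1019, 4
L = Q.bit_length() // 2          # l = half the bit size of the group order

def keygen():
    """Return (public, private) with public = g^private mod p."""
    priv = secrets.randbelow(Q - 1) + 1
    return pow(G, priv, P), priv

def truncate(elem):
    """Steps (1)-(4): map a group element i to (i mod 2^l) + 2^l."""
    return (elem % (1 << L)) + (1 << L)

def mqv_key(static_priv, eph_priv, eph_pub, their_static_pub, their_eph_pub):
    s = truncate(eph_pub)                        # s_A (or s_B)
    t = truncate(their_eph_pub)                  # t_A (or t_B)
    h = (eph_priv + s * static_priv) % Q         # step (5)
    base = their_eph_pub * pow(their_static_pub, t, P) % P
    return pow(base, h, P)                       # step (6)

pk_A, a = keygen()
pk_B, b = keygen()
ek_A, x = keygen()   # Alice's ephemeral key, sent to Bob
ek_B, y = keygen()   # Bob's ephemeral key, sent to Alice

K_A = mqv_key(a, x, ek_A, pk_B, ek_B)   # Alice's computation
K_B = mqv_key(b, y, ek_B, pk_A, ek_A)   # Bob's computation
```

Both calls use the same function with the roles swapped, which mirrors the remark that Bob runs "exactly the same calculation" as Alice.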


18.5.  The  Symbolic  Method  of  Protocol  Analysis 

One  can  see  that  the  above  protocols  are  very  intricate;  spotting  flaws  in  them  can  be  a  very  subtle 
business.  A  number  of  different  approaches  have  been  proposed  to  try  and  make  the  design  of 
these  protocols  more  scientific.  The  first  school  is  based  on  so-called  formal  methods,  and  treats 
protocols  via  means  of  a  symbolic  algebra.  The  second  school  is  closer  to  our  earlier  modelling  of 
encryption  and  signature  schemes,  in  that  it  is  based  on  a  cryptographic  game  between  a  challenger 
and  an  adversary. 

The  most  influential  of  the  methods  in  the  first  school  is  the  BAN  logic  invented  by  Burrows, 
Abadi  and  Needham.  The  BAN  logic  has  a  large  number  of  drawbacks  compared  to  more  modern 
logical analysis tools, but was very influential in the design and analysis of symmetric-key-based
key  agreement  protocols  such  as  Kerberos  and  the  Needham-Schroeder  protocol.  It  has  now  been 
supplanted  by  more  complicated  logics  and  formal  methods,  but  it  is  of  historical  importance  and 
the  study  of  the  BAN  logic  can  still  be  very  instructive  for  protocol  designers.  The  main  benefit  in 
using  symbolic  methods  and  logics  is  that  the  analysis  can  usually  be  semi-automated  via  theorem 
provers  and  the  like;  this  should  be  compared  to  the  cryptographic  analysis  method  which  is  still 
done  mainly  by  hand. 

First we really need to pause and decide what are the goals of key agreement and key transport,
and what position the parties start from. In the symmetric key setting we assume all parties, A
and B say, only share secret keys $K_{as}$ and $K_{bs}$ with the trusted third party S. In the public key
setting we assume that all parties have a public/private key pair $(\mathfrak{pk}_A, \mathfrak{sk}_A)$, and that the public
key $\mathfrak{pk}_A$ is bound to an entity's identity A, via some form of certificate.

In both cases parties A and B want to agree and/or transport a symmetric session key $K_{ab}$
for use in some further protocol. This new session key should be fresh, i.e. it has not been used
by any other party before and has been recently created. The freshness property will stop attacks
whereby the adversary replays messages so as to use an old key again. Freshness can also be useful
in deducing that the party with which you are communicating is still alive.

We also need to decide what capabilities an attacker has. As always we assume the worst
possible  situation  in  which  an  attacker  can  intercept  any  message  flow  over  the  network.  She  can 
then  stop  a  message,  alter  it  or  change  its  destination.  An  attacker  is  also  able  to  distribute  her 
own  messages  over  the  network.  With  such  a  high-powered  attacker  it  is  often  assumed  that  the 
attacker  is  the  network. 

The  main  idea  of  BAN  logic  is  that  one  should  concentrate  on  what  the  parties  believe  is 
happening.  It  does  not  matter  what  is  actually  happening;  we  need  to  understand  exactly  what 
each  party  can  logically  deduce,  from  its  own  view  of  the  protocol,  about  what  is  actually  happening. 

In  the  BAN  logic,  complex  statements  are  made  up  of  some  basic  atomic  statements  which 
are  either  true  or  false.  These  atomic  statements  can  be  combined  into  more  complex  ones  using 
conjunction,  which  is  denoted  by  a  comma.  The  basic  atomic  statements  are  given  by: 


• $P \mid\equiv X$ means P believes (or is entitled to believe) X.
  The principal P may act as though X is true.

• $P \triangleleft X$ means P sees X.
  Someone has sent a message to P containing X, so P can now read and repeat X.

• $P \mid\sim X$ means P once said X and P believed X when it was said.
  Note this tells us nothing about whether X was said recently or in the distant past.

• $P \mid\Rightarrow X$ means P has jurisdiction over X.
  This means P is an authority on X and should be trusted on this matter.

• $\#X$ means the formula X is fresh.
  This is usually used for nonces.

• $P \stackrel{K}{\leftrightarrow} Q$ means P and Q may use the shared key K to communicate.
  The key is assumed good and it will never be discovered by anyone other than P and Q,
  unless the protocol itself makes this happen.

• $\{X\}_K$ means as usual that X is encrypted under the key K.
  The encryption is assumed to be perfect in that X will remain secret unless deliberately
  disclosed by a party at some other point in the protocol.


We  start  with  a  set  of  statements  which  are  assumed  to  be  true  at  the  start  of  the  protocol.  When 
executing  the  protocol  we  infer  the  truth  of  new  statements  via  the  BAN  logic  postulates,  or  rules 
of  inference.  The  format  we  use  to  specify  rules  of  inference  is  as  follows: 


$$\frac{A,\ B}{C}$$


which  means  that  if  A  and  B  are  true  then  we  can  conclude  C  is  also  true.  This  is  a  standard 
notation  used  in  many  areas  of  logic  within  computer  science. 

• Message Meaning Rule:

$$\frac{A \mid\equiv A \stackrel{K}{\leftrightarrow} B,\quad A \triangleleft \{X\}_K}{A \mid\equiv B \mid\sim X}$$

In words, if both

– A believes she shares the key K with B,

– A sees X encrypted under the key K,

we can deduce that A believes that B once said X. Note that this implicitly assumes that
A never said X.

• Nonce Verification Rule:

$$\frac{A \mid\equiv \#X,\quad A \mid\equiv B \mid\sim X}{A \mid\equiv B \mid\equiv X}$$

In words, if both

– A believes X is fresh (i.e. recent),

– A believes B once said X,

then we can deduce that A believes that B still believes X.

• Jurisdiction Rule:

$$\frac{A \mid\equiv B \mid\Rightarrow X,\quad A \mid\equiv B \mid\equiv X}{A \mid\equiv X}$$

In words, if both

– A believes B has jurisdiction over X, i.e. A trusts B on X,

– A believes B believes X,

then we conclude that A also believes X.

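• Other Rules: The belief operator and conjunction can be manipulated as follows:

$$\frac{P \mid\equiv X,\quad P \mid\equiv Y}{P \mid\equiv (X, Y)}, \qquad \frac{P \mid\equiv (X, Y)}{P \mid\equiv X}, \qquad \frac{P \mid\equiv Q \mid\equiv (X, Y)}{P \mid\equiv Q \mid\equiv X}.$$

A similar rule also applies to the "once said" operator

$$\frac{P \mid\equiv Q \mid\sim (X, Y)}{P \mid\equiv Q \mid\sim X}.$$

Note that $P \mid\equiv Q \mid\sim X$ and $P \mid\equiv Q \mid\sim Y$ does not imply $P \mid\equiv Q \mid\sim (X, Y)$, since that would
imply X and Y were said at the same time. Finally, if part of a formula is fresh then so
is the whole formula

$$\frac{P \mid\equiv \#X}{P \mid\equiv \#(X, Y)}.$$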
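Because the rules of inference are purely syntactic, they can be applied mechanically. The sketch below encodes a subset of the postulates as a small forward-chaining loop. The tuple encoding of statements and the names `tag`, `key` and `closure` are assumptions made for this illustration; the side condition of the message meaning rule (that A never said X herself) is not checked, and freshness is only propagated to conjunctions that some party is believed to have said, so that the set of derived facts stays finite.

```python
def tag(s):
    """Return the operator name of a compound statement, or None for an atom."""
    return s[0] if isinstance(s, tuple) else None

def key(k, p, q):
    """The statement 'P and Q share the key K' (the order of P and Q is irrelevant)."""
    return ("key", k, frozenset((p, q)))

def closure(facts):
    """Apply the rules repeatedly until no new statement can be derived."""
    facts = set(facts)
    while True:
        beliefs = {(f[1], f[2]) for f in facts if tag(f) == "believes"}
        new = set()
        for f in facts:
            # Message meaning: A believes the key K is shared with B and
            # A sees {X}_K, hence A believes B once said X.
            if tag(f) == "sees" and tag(f[2]) == "enc":
                a, (_, k, x) = f[1], f[2]
                for p, s in beliefs:
                    if p == a and tag(s) == "key" and s[1] == k and a in s[2]:
                        for b in s[2] - {a}:
                            new.add(("believes", a, ("said", b, x)))
        for a, s in beliefs:
            if tag(s) == "said":
                b, x = s[1], s[2]
                parts = x[1:] if tag(x) == "and" else (x,)
                # Freshness: if one part is fresh then so is the whole formula.
                if any((a, ("fresh", y)) in beliefs for y in parts):
                    new.add(("believes", a, ("fresh", x)))
                # Nonce verification: fresh and once said gives current belief.
                if (a, ("fresh", x)) in beliefs:
                    new.add(("believes", a, ("believes", b, x)))
            elif tag(s) == "believes":
                b, x = s[1], s[2]
                # Splitting a conjunction under a nested belief.
                if tag(x) == "and":
                    new.update(("believes", a, ("believes", b, y)) for y in x[1:])
                # Jurisdiction: A trusts B on X and believes B believes X.
                if (a, ("controls", b, x)) in beliefs:
                    new.add(("believes", a, x))
        if new <= facts:
            return facts
        facts |= new

# A made-up example: A receives {N, X}_K, believes K is shared with B,
# believes the nonce N is fresh and trusts B on X.
start = {
    ("sees", "A", ("enc", "K", ("and", "N", "X"))),
    ("believes", "A", key("K", "A", "B")),
    ("believes", "A", ("fresh", "N")),
    ("believes", "A", ("controls", "B", "X")),
}
derived = closure(start)
```

In this example the chain message meaning, freshness, nonce verification, conjunction splitting and jurisdiction yields the statement that A believes X; removing the freshness assumption on N blocks the chain at the nonce verification step.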

We  wish  to  analyse  a  key  agreement  protocol  between  A  and  B  using  the  BAN  logic.  But  what  is 
the  goal  of  such  a  protocol  when  expressed  in  this  formalism?  The  minimum  we  want  to  achieve  is 


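$$A \mid\equiv A \stackrel{K}{\leftrightarrow} B \quad \text{and} \quad B \mid\equiv A \stackrel{K}{\leftrightarrow} B,$$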


i.e.  both  parties  believe  they  share  a  secret  key  with  each  other.  However,  we  could  expect  to 
achieve  more,  for  example 


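$$A \mid\equiv B \mid\equiv A \stackrel{K}{\leftrightarrow} B \quad \text{and} \quad B \mid\equiv A \mid\equiv A \stackrel{K}{\leftrightarrow} B$$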


which  is  called  key  confirmation.  In  words,  we  may  want  to  achieve  that,  after  the  protocol  has 
run,  A  is  assured  that  B  knows  he  is  sharing  a  key  with  A,  and  it  is  the  same  key  A  believes  she 
is  sharing  with  B. 

Before  analysing  a  protocol  using  the  BAN  logic  we  convert  the  protocol  into  logical  statements. 
This process is called idealization, and is the most error-prone part of the procedure since it cannot
be  automated.  We  also  need  to  specify  the  assumptions,  or  axioms,  which  hold  at  the  beginning 
of  the  protocol.  To  see  this  process  in  “real  life”  we  analyse  the  Wide-Mouth  Frog  protocol  for  key 
agreement  using  synchronized  clocks. 


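Example: Wide-Mouth Frog Protocol: Recall the Wide-Mouth Frog protocol

$$A \to S : A, \{T_a, B, K_{ab}\}_{K_{as}},$$

$$S \to B : \{T_s, A, K_{ab}\}_{K_{bs}}.$$

This becomes the idealized protocol

$$A \to S : \{T_a, A \stackrel{K_{ab}}{\leftrightarrow} B\}_{K_{as}},$$

$$S \to B : \{T_s, A \mid\equiv A \stackrel{K_{ab}}{\leftrightarrow} B\}_{K_{bs}}.$$

One should read the idealization of the first message as telling S that

• $T_a$ is a timestamp/nonce,

• $K_{ab}$ is a key which is meant as a key to communicate with B.

So what assumptions exist at the start of the protocol? Clearly A, B and S share secret keys,
which in BAN logic becomes

$$A \mid\equiv A \stackrel{K_{as}}{\leftrightarrow} S, \qquad S \mid\equiv A \stackrel{K_{as}}{\leftrightarrow} S,$$

$$B \mid\equiv B \stackrel{K_{bs}}{\leftrightarrow} S, \qquad S \mid\equiv B \stackrel{K_{bs}}{\leftrightarrow} S.$$

There are a couple of nonce assumptions,

$$S \mid\equiv \#T_a \quad \text{and} \quad B \mid\equiv \#T_s.$$

Finally, we have the following three assumptions

• B trusts A to invent good keys,

$$B \mid\equiv (A \mid\Rightarrow A \stackrel{K_{ab}}{\leftrightarrow} B),$$

• B trusts S to relay the key from A,

$$B \mid\equiv (S \mid\Rightarrow A \mid\equiv A \stackrel{K_{ab}}{\leftrightarrow} B),$$

• A knows the session key in advance,

$$A \mid\equiv A \stackrel{K_{ab}}{\leftrightarrow} B.$$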


Notice  how  these  last  three  assumptions  specify  the  problems  we  associated  with  this  protocol  in 
the  earlier  section.  Using  these  assumptions  we  can  now  analyse  the  protocol. 

• Let us see what we can deduce from the first message

$$A \to S : \{T_a, A \stackrel{K_{ab}}{\leftrightarrow} B\}_{K_{as}}.$$

– Since S sees the message encrypted under $K_{as}$ he can deduce that A said the message.

– Since $T_a$ is believed by S to be fresh he concludes the whole message is fresh.

– Since the whole message is fresh, S concludes that A currently believes the whole of
it.

– S then concludes

$$S \mid\equiv A \mid\equiv A \stackrel{K_{ab}}{\leftrightarrow} B,$$

which is what we need to conclude so that S can send the second message of the
protocol.

• We now look at what happens when we analyse the second message

$$S \to B : \{T_s, A \mid\equiv A \stackrel{K_{ab}}{\leftrightarrow} B\}_{K_{bs}}.$$

– Since B sees the message encrypted under $K_{bs}$ he can deduce that S said the message.

– Since $T_s$ is believed by B to be fresh he concludes the whole message is fresh.

– Since the whole message is fresh, B concludes that S currently believes the whole of
it.

– So B believes that S believes the second part of the message.

– But B believes S has authority on whether A knows the key and B believes A has
authority to generate the key.

From the analysis of both messages we can conclude

$$B \mid\equiv A \stackrel{K_{ab}}{\leftrightarrow} B$$

and

$$B \mid\equiv A \mid\equiv A \stackrel{K_{ab}}{\leftrightarrow} B.$$

Combining this with our axiom, $A \mid\equiv A \stackrel{K_{ab}}{\leftrightarrow} B$, we conclude that the key agreement protocol
is sound. The only requirement we have not met is that

$$A \mid\equiv B \mid\equiv A \stackrel{K_{ab}}{\leftrightarrow} B,$$

i.e. A does not achieve confirmation that B has received the key.
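For reference, the deductions made by B on the second message can be written out as an explicit chain of rule applications. This is a reconstruction from the postulates and assumptions stated above, with $\varphi$ introduced here purely as shorthand for the statement $A \stackrel{K_{ab}}{\leftrightarrow} B$.

```latex
\begin{align*}
&B \triangleleft \{T_s,\ A \mid\equiv \varphi\}_{K_{bs}},\quad
 B \mid\equiv B \stackrel{K_{bs}}{\leftrightarrow} S
 &&\Longrightarrow\quad B \mid\equiv S \mid\sim (T_s,\ A \mid\equiv \varphi)
 &&\text{(message meaning)}\\
&B \mid\equiv \#T_s
 &&\Longrightarrow\quad B \mid\equiv \#(T_s,\ A \mid\equiv \varphi)
 &&\text{(freshness)}\\
&B \mid\equiv \#(T_s,\ A \mid\equiv \varphi),\quad
 B \mid\equiv S \mid\sim (T_s,\ A \mid\equiv \varphi)
 &&\Longrightarrow\quad B \mid\equiv S \mid\equiv (T_s,\ A \mid\equiv \varphi)
 &&\text{(nonce verification)}\\
&B \mid\equiv S \mid\equiv (T_s,\ A \mid\equiv \varphi)
 &&\Longrightarrow\quad B \mid\equiv S \mid\equiv A \mid\equiv \varphi
 &&\text{(conjunction)}\\
&B \mid\equiv (S \mid\Rightarrow A \mid\equiv \varphi),\quad
 B \mid\equiv S \mid\equiv A \mid\equiv \varphi
 &&\Longrightarrow\quad B \mid\equiv A \mid\equiv \varphi
 &&\text{(jurisdiction)}\\
&B \mid\equiv (A \mid\Rightarrow \varphi),\quad
 B \mid\equiv A \mid\equiv \varphi
 &&\Longrightarrow\quad B \mid\equiv \varphi
 &&\text{(jurisdiction)}
\end{align*}
```

The last two lines are the two conclusions of the analysis, and each of them consumes exactly one of the two trust assumptions that B makes about S and A.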

Notice  what  the  application  of  the  BAN  logic  has  done  is  to  make  the  axioms  clearer,  so  it  is  easier 
to  compare  which  assumptions  each  protocol  needs  to  make  it  work.  In  addition  it  clarifies  what 
the  result  of  running  the  protocol  is  from  all  parties’  points  of  view.  However,  it  does  not  prove 
the  protocol  is  secure  in  a  cryptographic  sense.  To  do  that  we  need  a  more  complex  method  of 
analysis,  which  is  less  amenable  to  computer  application. 


18.6.  The  Game-Based  Method  of  Protocol  Analysis 

To provide a more rigorous analysis of protocols (which does not assume, for example, that
encryption works as a perfect black box), we now present a method of analysis which resembles the games
we used to introduce encryption, signatures and MACs. Recall that there we had an adversary A
who tried to achieve a certain goal, using a set of powers. The goal was presented as some winning
condition, and the powers were presented as giving the adversary access to some oracles. For key
agreement protocols we will adopt a winning condition which is akin to the Real-or-Random
winning condition for the encryption games, i.e. an adversary should not be able to tell the difference
between a real key agreed during a key agreement protocol and a key chosen completely at random.

What  is  more  complicated  is  the  definition  of  the  oracles.  In  our  previous  examples  the  oracles 
were  relatively  simple;  they  either  encrypted,  decrypted,  signed  or  verified  some  data  given  some 
hidden key. As a key agreement protocol is a protocol, data is passed between players; in addition
there  could  be  many  keys  within  the  protocol  and  so  one  piece  of  data  could  be  sent  to  be  processed 
by  different  entities  using  different  keys  (recall  we  assume  the  adversary  has  control  of  the  network). 
In  addition  we  have  seen  attacks  on  key  agreement  protocols  which  rely  on  messages  being  passed 
to  additional  entities,  and  not  just  two.  Thus  we  need  to  model  security  where  there  are  many 
participants.  In  addition  participants  may  be  interacting  with  many  parties  at  the  same  time,  and 
could  be  interacting  with  the  same  party  many  times  (e.g.  a  client  connecting  to  a  web  server  in 
multiple  sessions). 


Modelling the Participants: We first set up some parties. These are going to be users $U \in \mathcal{U}$
who start our game being honest participants in the protocol. For symmetric key based protocols
each entity will have a list of secret keys $k_{US}$ shared with a special trusted party S. For
public key protocols each party will have a public/private key pair $(\mathfrak{pk}_U, \mathfrak{sk}_U)$, where $\mathfrak{sk}_U$ will be
held by the party and the public key $\mathfrak{pk}_U$ will be held (in a certified form) by all other parties. It
may be the case that parties have two public/private key pairs, one for public key encryption and
one for public key signatures, in which case we will denote them by $(\mathfrak{pk}_U^{(e)}, \mathfrak{sk}_U^{(e)})$ and $(\mathfrak{pk}_U^{(s)}, \mathfrak{sk}_U^{(s)})$.
Each party U will have a state $\mathsf{state}_U = \{\kappa_U, \delta_U\}$, which consists of the secret data above $\kappa_U$ and a
Boolean variable $\delta_U$. The variable $\delta_U$ denotes whether the party is corrupted or not. At the start
of the game $\delta_U$ is set to false.

For public key protocols we are also going to have a set of users $V \in \mathcal{V}$ for whom the adversary
can register its own public keys. For the users in $\mathcal{V}$ we do not assume that the adversary knows
the underlying private key, only that it registers public keys of its choice.

The adversary can interact with these parties in the following manner. It has an oracle $\mathcal{O}_U$,
for each user $U \in \mathcal{U}$, to which it can pass a single command corrupt; on receiving this command
the game returns the value $\kappa_U$ to the adversary and sets $\delta_U \leftarrow$ true. Thus this oracle allows the
adversary to take control of any party she desires. In addition there is an oracle $\mathcal{O}_V$, for each user
$V \in \mathcal{V}$, to which it can pass the single command (register, $\mathfrak{pk}_V$) which registers $\mathfrak{pk}_V$ as the public
key of the user V.


Modelling the Sessions: As well as the data associated with each party we have data associated
with the view of a party of a session in which it is engaging. Note that this is about the view of
the party and may not correspond to the truth. Each party $U \in \mathcal{U}$ may have multiple sessions
which it thinks it is having with party $U' \in \mathcal{U} \cup \mathcal{V}$; party U will index these sessions by an integer
variable i. The goal of the session is to agree an ephemeral secret key $\mathfrak{ek}_{U,U',i}$ and as the key
agreement protocol proceeds the session will have different states, which we shall denote by $\Sigma_{U,U',i}$.
The possible states are:

• $\perp$: This is the initial value of $\Sigma_{U,U',i}$ and indicates that nothing has happened.

• accept: This indicates that party U thinks that a key has been agreed, and party U thinks
it is equal to $\mathfrak{ek}_{U,U',i}$.

• reject: This indicates that party U has rejected this session of the protocol.

• revealed: This indicates that the adversary has obtained the key $\mathfrak{ek}_{U,U',i}$; see below for how
she did this.

Within a session we also maintain a list of messages received and sent, a so-called transcript
$\mathcal{T}_{U,U',i} = \{m_1, r_1, m_2, r_2, \ldots, m_n, r_n\}$, where $m_j$ is the jth message received by party U in its ith
session with U', and $r_j$ is the associated response which it made. The transcript is initially set to
be the empty set. Thus the session state is given by $\mathsf{session}_{U,U',i} = \{\mathfrak{ek}_{U,U',i}, \Sigma_{U,U',i}, \mathcal{T}_{U,U',i}, s_{U,U',i}\}$,
where $s_{U,U',i}$ is some protocol specific state information.

To obtain information about the sessions, and to drive the protocol, the adversary has a message
oracle $\mathcal{O}$ which takes four arguments, $\mathcal{O}(U, U', i, m)$, which is processed as follows:

• If m = reveal and $\Sigma_{U,U',i}$ = accept then set $\Sigma_{U,U',i} \leftarrow$ revealed and return $\mathfrak{ek}_{U,U',i}$ to the
adversary.

• If m = init and $\mathcal{T}_{U,U',i} = \emptyset$ then start a new protocol session with U' and let r be the initial
message, set $\mathcal{T}_{U,U',i} \leftarrow \{m, r\}$ and return r to the adversary.

• Otherwise if $\mathcal{T}_{U,U',i} = \{m_1, r_1, m_2, r_2, \ldots, m_n, r_n\}$ and the message m is the (n+1)st input
message in the protocol, the oracle computes the response $r_{n+1}$, adds the message
and response to $\mathcal{T}_{U,U',i}$ and returns $r_{n+1}$ to the adversary. The final response in a protocol,
to signal the end, is to set $r_{n+1} \leftarrow \perp$.

Notice  that  messages  are  not  sent  between  participants;  we  allow  the  adversary  to  do  that.  She 
can  decide  which  message  gets  sent  where,  or  if  it  even  gets  sent  at  all. 

Matching Conversations: We now need to define when two such sessions, run by different
participants, correspond to the same protocol execution. This is done using the concept of a matching
conversation. Suppose we have two transcripts

$$\mathcal{T}_{U,U',i} = \{\mathrm{init}, r_1, m_2, r_2, \ldots, m_n, r_n\},$$

$$\mathcal{T}_{W,W',i'} = \{m'_1, r'_1, m'_2, r'_2, \ldots, m'_k, r'_k\},$$

such that

• $m'_j = r_j$ for $j \ge 1$,

• $m_j = r'_{j-1}$ for $j > 1$,

• n even: $r_n = \perp$ and $k = n - 1$,

• n odd: $r'_k = \perp$ and $k = n$,

• $U = W'$ and $U' = W$.

In  such  a  situation  we  say  the  two  transcripts  are  matching  conversations. 
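The five conditions translate directly into a predicate on two transcripts. The sketch below is one possible reading, with transcripts held as flat Python lists `[m1, r1, m2, r2, ...]`, the string `"init"` for the initial command and `None` standing in for the final response ⊥; the function name `matching` and its argument names are assumptions made for illustration.

```python
BOT = None  # stands in for the end-of-protocol response ⊥

def matching(u, u_prime, t_init, w, w_prime, t_resp):
    """True if the initiator transcript t_init of session (u, u_prime, .) and the
    responder transcript t_resp of session (w, w_prime, .) are matching conversations."""
    # U = W' and U' = W: each side thinks it is talking to the other.
    if u != w_prime or u_prime != w:
        return False
    # Transcripts are (message, response) pairs; the initiator starts with init.
    if not t_init or len(t_init) % 2 or len(t_resp) % 2 or t_init[0] != "init":
        return False
    m, r = t_init[0::2], t_init[1::2]      # m_1..m_n and r_1..r_n
    m2, r2 = t_resp[0::2], t_resp[1::2]    # m'_1..m'_k and r'_1..r'_k
    n, k = len(m), len(m2)
    if n % 2 == 0:
        # n even: the initiator ends the run, r_n = ⊥ and k = n - 1.
        if not (r[-1] is BOT and k == n - 1):
            return False
    else:
        # n odd: the responder ends the run, r'_k = ⊥ and k = n.
        if not (k == n and r2[-1] is BOT):
            return False
    # m'_j = r_j: everything the initiator sent was received unchanged.
    sent_ok = all(m2[j] == r[j] for j in range(k))
    # m_j = r'_{j-1} for j > 1: everything the responder sent was received unchanged.
    recv_ok = all(m[j] == r2[j - 1] for j in range(1, n))
    return sent_ok and recv_ok

# A two-flow exchange such as Diffie-Hellman: the initiator sends 5 and receives 7.
two_flow = matching("U", "U'", ["init", 5, 7, BOT], "U'", "U", [5, 7])
```

Note that the identity condition is checked independently of the message flows, so two sessions can exchange exactly the same messages without being matching conversations.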

Winning the Game: Recall that our winning condition is going to be akin to the Real-or-Random
definition for encryption security. So the game selects a hidden bit b; if b = 0 then the adversary
will be given a random key for its challenge, otherwise it will be given a real key. So we need to
decide on which key agreement protocol execution the adversary will be challenged. Instead of
defining one for the adversary we let the adversary choose its own session on which it wants to
be queried, subject to some constraints which we shall discuss below. This is done by means of a
so-called test oracle $\mathcal{O}_{\mathrm{Test}}$.

The test oracle takes as input a tuple (U, U', i), with $U, U' \in \mathcal{U}$, and if b = 1 it returns $\mathfrak{ek}_{U,U',i}$,
and if b = 0 it returns a random key of the same size as $\mathfrak{ek}_{U,U',i}$. The test oracle may only be called
once by the adversary, and when making the call the adversary has to obey certain rules of the
game. The restrictions on the test oracle query are for two reasons; firstly to ensure that the query
makes sense, and secondly to ensure that the game is not too easy to win. For our purposes the
restrictions will be as follows:

• $\Sigma_{U,U',i}$ = accept. Otherwise the key is either not defined, or it has been revealed and the
adversary already knows it.

• There should be no revealed session (W, W', i') which has a matching conversation with
(U, U', i). Since a matching conversation should correspond to the same session key, this
protects against a trivial break in these circumstances.

• $\delta_U = \delta_{U'}$ = false. In other words both U and its partner U' are not corrupted.

Note that we do not assume that the test session has any matching conversations at all. It could
be a conversation with the adversary.

At the end of the game (which we call the Authenticated Key Agreement game) the adversary
A outputs its guess b' as to the hidden bit b, and we define the advantage of the adversary A against
protocol $\Pi$ in the usual way as

$$\mathrm{Adv}^{\mathrm{AKA}}_{\Pi}(A; n_P, n_S) = 2 \cdot \left| \Pr[A \text{ wins}] - \frac{1}{2} \right|,$$

where $n_P$ upper bounds the number of participants with which A interacts and $n_S$ upper bounds
the number of sessions executable between two participants. To avoid trivial breaks of some simple
protocols we also impose the restriction that after the test query on (U, U', i) is made we do not
allow calls to $\mathcal{O}_U$ and $\mathcal{O}_{U'}$, i.e. at the end of the game U and U' must still be uncorrupted. If we
want to model protocols with forward secrecy then we remove this restriction; thus the test session
will consist of messages which were sent and received by U before the corruption of either party
occurred. We shall not discuss forward secrecy anymore, except to point out when protocols are not
forward secure.

For a protocol to be deemed secure we need more than just that the advantage is small.

Definition 18.1 (AKA Security). An authenticated key agreement protocol $\Pi$ is said to be secure
if

(1) If there is a matching conversation between sessions (U, U', i) and (U', U, j) and U and U'
are uncorrupted, then we have $\Sigma_{U,U',i} = \Sigma_{U',U,j}$ = accept and $\mathfrak{ek} = \mathfrak{ek}_{U,U',i} = \mathfrak{ek}_{U',U,j}$ and
$\mathfrak{ek}$ is distributed uniformly at random over the desired key space.

(2) $\mathrm{Adv}^{\mathrm{AKA}}_{\Pi}(A)$ is "small".

Public-Key-Encryption-Based Key Transport Example: To define a protocol within the
above game framework we simply need to provide the information as to how the oracle $\mathcal{O}(U, U', i, m)$
should behave. We present a number of examples, all based on public key techniques; we leave
symmetric-key-based protocols to the reader. We start with the basic key transport based on public
key encryption, in particular the third version of the protocol given in Figure 18.11. The "code"
for the $\mathcal{O}(U, U', i, m)$ oracle in this case is given by:

• If m = reveal and $\Sigma_{U,U',i}$ = accept then set $\Sigma_{U,U',i} \leftarrow$ revealed and return $\mathfrak{ek}_{U,U',i}$ to the
adversary.

• If m = init and $\mathcal{T}_{U,U',i} = \emptyset$ then set $\mathfrak{ek}_{U,U',i} \leftarrow \mathbb{K}$, $c_U \leftarrow e_{\mathfrak{pk}_{U'}^{(e)}}(U \| \mathfrak{ek}_{U,U',i})$, $s_U \leftarrow \mathrm{Sig}_{\mathfrak{sk}_U^{(s)}}(c_U)$,
$r \leftarrow (c_U, s_U)$, $\Sigma_{U,U',i} \leftarrow$ accept, $\mathcal{T}_{U,U',i} = \{\mathrm{init}, r\}$, and return r to the adversary.

• If $m = (c_{U'}, s_{U'})$, $\mathcal{T}_{U,U',i} = \emptyset$ and $\mathrm{Verify}_{\mathfrak{pk}_{U'}^{(s)}}(s_{U'}, c_{U'})$ = true then set $m^* \leftarrow d_{\mathfrak{sk}_U^{(e)}}(c_{U'})$. If
$m^* = \perp$ then abort. Parse $m^*$ as $(A \| \mathfrak{ek}_{U,U',i})$ and abort if $A \ne U'$. Finally set $\Sigma_{U,U',i} \leftarrow$
accept, $\mathcal{T}_{U,U',i} = \{m, \perp\}$, and return $\perp$ to the adversary.

• Otherwise abort.

However, earlier we said that this had a replay attack. This replay attack can now be described in
our AKA model. We take the two-user case, and pass the output from the initiator to the responder
twice. Thus the responder thinks they are engaging in two sessions with the initiator. We then
reveal on one of these sessions and test the other, and so we can decide if the test oracle
returns a random value or not.

• Assume two parties U and U'.

• $(c_U, s_U) \leftarrow \mathcal{O}(U, U', 1, \mathrm{init})$.

• $\mathcal{O}(U', U, 1, (c_U, s_U))$.

• $\mathcal{O}(U', U, 2, (c_U, s_U))$.

• $\mathfrak{ek}' \leftarrow \mathcal{O}(U', U, 2, \mathrm{reveal})$.

• $\mathfrak{ek} \leftarrow \mathcal{O}_{\mathrm{Test}}(U', U, 1)$.

• If $\mathfrak{ek} = \mathfrak{ek}'$ then return b' = 1.

• Return b' = 0.

An  obvious  fix  is  to  remove  the  ability  for  replays,  but  without  each  party  maintaining  a  large 
state  this  is  relatively  complex.  As  remarked  above,  the  preferred  fix  would  be  to  have  each  party 
contribute  entropy  to  each  protocol  run. 
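The replay attack can be simulated in a stripped-down version of the game. The sketch below is an idealised model only, not a cryptographic implementation: ciphertexts and signatures are opaque handles recorded in tables owned by the game, so they cannot be forged or opened by the adversary; corruption is not modelled; and the matching-conversation check in the test oracle is a simplified one that suffices for complete runs. The class and function names (`KeyTransportGame`, `replay_adversary`) are assumptions made for illustration.

```python
import secrets

class Abort(Exception):
    """Raised when an oracle call hits an 'abort' branch or breaks a rule."""

class KeyTransportGame:
    def __init__(self):
        self.b = secrets.randbelow(2)   # hidden bit
        self.sessions = {}              # (U, U', i) -> session record
        self.plaintexts = {}            # ciphertext handle -> (recipient, sender, key)
        self.signed = set()             # ideal signatures ("sig", signer, ciphertext)

    def O(self, u, v, i, m):
        ses = self.sessions.setdefault((u, v, i), {"state": None, "T": [], "key": None})
        if m == "reveal" and ses["state"] == "accept":
            ses["state"] = "revealed"
            return ses["key"]
        if m == "init" and not ses["T"]:
            ses["key"] = secrets.token_bytes(16)
            c = secrets.token_hex(16)                  # ideal encryption of (u || key) to v
            self.plaintexts[c] = (v, u, ses["key"])
            s = ("sig", u, c)                          # ideal signature by u on c
            self.signed.add(s)
            ses["state"], ses["T"] = "accept", ["init", (c, s)]
            return (c, s)
        if isinstance(m, tuple) and len(m) == 2 and not ses["T"]:
            c, s = m
            if s != ("sig", v, c) or s not in self.signed:
                raise Abort()                          # signature does not verify
            recipient, sender, k = self.plaintexts.get(c, (None, None, None))
            if recipient != u or sender != v:
                raise Abort()                          # decryption fails or wrong identity
            ses["key"], ses["state"], ses["T"] = k, "accept", [m, None]
            return None
        raise Abort()

    def _partnered(self, a, b):
        """Simplified matching-conversation check for complete runs."""
        ta, tb = self.sessions[a]["T"], self.sessions[b]["T"]
        sent = lambda t: [x for x in t[1::2] if x is not None]
        recv = lambda t: [x for x in t[0::2] if x != "init"]
        return (a[0], a[1]) == (b[1], b[0]) and sent(ta) == recv(tb) and recv(ta) == sent(tb)

    def test(self, u, v, i):
        sid = (u, v, i)
        if self.sessions[sid]["state"] != "accept":
            raise Abort()
        for other, rec in self.sessions.items():
            if rec["state"] == "revealed" and self._partnered(sid, other):
                raise Abort()
        return self.sessions[sid]["key"] if self.b == 1 else secrets.token_bytes(16)

def replay_adversary(game):
    msg = game.O("U", "U'", 1, "init")
    game.O("U'", "U", 1, msg)
    game.O("U'", "U", 2, msg)                   # the replayed message
    revealed = game.O("U'", "U", 2, "reveal")
    tested = game.test("U'", "U", 1)
    return 1 if tested == revealed else 0

game = KeyTransportGame()
guess = replay_adversary(game)
```

The test query is permitted because the revealed session (U', U, 2) is run by U' rather than by U, so it cannot be a matching conversation for the tested session (U', U, 1), even though both sessions hold the same key.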

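Diffie-Hellman Example: We now turn to the basic Diffie-Hellman protocol from Figure 18.12.
We assume we are working in a finite abelian group G of order q generated by g. The "code" for
the $\mathcal{O}(U, U', i, m)$ oracle in this case is given by:

• If m = reveal and $\Sigma_{U,U',i}$ = accept then set $\Sigma_{U,U',i} \leftarrow$ revealed and return $\mathfrak{ek}_{U,U',i}$ to the
adversary.

• If m = init and $\mathcal{T}_{U,U',i} = \emptyset$ then $s_{U,U',i} \leftarrow \mathbb{Z}/q\mathbb{Z}$, $r \leftarrow g^{s_{U,U',i}}$, $\mathcal{T}_{U,U',i} = \{\mathrm{init}, r\}$, and return
r to the adversary.

• If $m \in G$ and $\mathcal{T}_{U,U',i} = \emptyset$ then $s_{U,U',i} \leftarrow \mathbb{Z}/q\mathbb{Z}$, $r \leftarrow g^{s_{U,U',i}}$, $\mathfrak{ek}_{U,U',i} \leftarrow m^{s_{U,U',i}}$, $\Sigma_{U,U',i} \leftarrow$
accept, $\mathcal{T}_{U,U',i} = \{m, r\}$, and return r to the adversary.

• If $m \in G$ and $\mathcal{T}_{U,U',i} = \{\mathrm{init}, r_1\}$ then $\mathfrak{ek}_{U,U',i} \leftarrow m^{s_{U,U',i}}$, $\Sigma_{U,U',i} \leftarrow$ accept, $\mathcal{T}_{U,U',i} =
\{\mathrm{init}, r_1, m, \perp\}$, and return $\perp$ to the adversary.

• Otherwise abort.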

We already know this is not a secure authenticated key agreement protocol due to the
man-in-the-middle attack. Indeed no public keys are even used within the protocol, so it does not have any
authentication at all within it. However, to illustrate how this is captured in the AKA security
model we present an attack within the model. Our adversary performs the following steps:

• Assume four distinct parties U, U', K and L.

• $\mathfrak{ek}_U \leftarrow \mathcal{O}(U, U', 1, \mathrm{init})$.

• $\mathfrak{ek}_K \leftarrow \mathcal{O}(K, L, 1, \mathfrak{ek}_U)$.

• $\mathcal{O}(U, U', 1, \mathfrak{ek}_K)$.

• $\mathfrak{ek}' \leftarrow \mathcal{O}(K, L, 1, \mathrm{reveal})$.

• $\mathfrak{ek} \leftarrow \mathcal{O}_{\mathrm{Test}}(U, U', 1)$.

• If $\mathfrak{ek} = \mathfrak{ek}'$ then return b' = 1.

• Return b' = 0.

Note that this attack works since the test session (U, U', 1) does not have a matching conversation
with any other session. It does "match" with the session (K, L, 1) in terms of message flows, but we
do not have U = L and U' = K. This allows the adversary to reveal the key for session (K, L, 1),
whilst still allowing session (U, U', 1) to be passed to the test oracle. Also note that the sessions
(U, U', 1) and (K, L, 1) agree on the same key, and that neither U nor U' have been corrupted.
Thus the above method means the adversary can win the AKA security game with probability one
for the Diffie-Hellman protocol.
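The oracle and the attack can be run end to end in a toy group. This is a sketch under stated assumptions: the group (p = 2039, q = 1019, g = 4) is a hypothetical toy choice, corruption oracles are omitted because the protocol has no long-term keys, and the matching-conversation check in the test oracle is a simplified one for complete two-flow runs. In a group this small a random key coincides with the real one with probability 1/q, so the simulated adversary wins with probability very close to, rather than exactly, one.

```python
import secrets

# Toy group (assumed for illustration): g = 4 has prime order q modulo p.
P, Q, G = 2039, 1019, 4

class Abort(Exception):
    """Raised when an oracle call hits the 'Otherwise abort' branch or breaks a rule."""

def in_group(m):
    return isinstance(m, int) and 0 < m < P and pow(m, Q, P) == 1

class DHGame:
    def __init__(self):
        self.b = secrets.randbelow(2)    # hidden bit
        self.sessions = {}               # (U, U', i) -> session record
        self.tested = False

    def O(self, u, v, i, m):
        ses = self.sessions.setdefault(
            (u, v, i), {"state": None, "T": [], "s": None, "key": None})
        if m == "reveal" and ses["state"] == "accept":
            ses["state"] = "revealed"
            return ses["key"]
        if m == "init" and not ses["T"]:                 # initiator, first flow
            ses["s"] = secrets.randbelow(Q)
            r = pow(G, ses["s"], P)
            ses["T"] = ["init", r]
            return r
        if in_group(m) and not ses["T"]:                 # responder
            ses["s"] = secrets.randbelow(Q)
            r = pow(G, ses["s"], P)
            ses["key"], ses["state"] = pow(m, ses["s"], P), "accept"
            ses["T"] = [m, r]
            return r
        if in_group(m) and len(ses["T"]) == 2 and ses["T"][0] == "init":
            ses["key"], ses["state"] = pow(m, ses["s"], P), "accept"
            ses["T"] += [m, None]                        # final response is ⊥
            return None
        raise Abort()

    def _partnered(self, a, b):
        """Simplified matching-conversation check for complete two-flow runs."""
        ta, tb = self.sessions[a]["T"], self.sessions[b]["T"]
        sent = lambda t: [x for x in t[1::2] if x is not None]
        recv = lambda t: [x for x in t[0::2] if x != "init"]
        return (a[0], a[1]) == (b[1], b[0]) and sent(ta) == recv(tb) and recv(ta) == sent(tb)

    def test(self, u, v, i):
        sid = (u, v, i)
        if self.tested or self.sessions[sid]["state"] != "accept":
            raise Abort()
        for other, rec in self.sessions.items():
            if rec["state"] == "revealed" and self._partnered(sid, other):
                raise Abort()
        self.tested = True
        real = self.sessions[sid]["key"]
        return real if self.b == 1 else pow(G, secrets.randbelow(Q), P)

def adversary(game):
    ek_U = game.O("U", "U'", 1, "init")
    ek_K = game.O("K", "L", 1, ek_U)       # relay U's message to an unrelated session
    game.O("U", "U'", 1, ek_K)
    revealed = game.O("K", "L", 1, "reveal")
    tested = game.test("U", "U'", 1)
    return 1 if tested == revealed else 0

game = DHGame()
guess = adversary(game)
```

If the adversary instead relays U's message to a session (U', U, 1) and reveals that session, the simplified check in `test` finds a revealed partner and the test query is refused, which is the trivial break that the matching-conversation restriction is there to exclude.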
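Signed Diffie-Hellman Example: We now turn to the Signed Diffie-Hellman protocol from
Figure 18.13. The "code" for the $\mathcal{O}(U, U', i, m)$ oracle is now given by:

• If m = reveal and $\Sigma_{U,U',i}$ = accept then set $\Sigma_{U,U',i} \leftarrow$ revealed and return $\mathfrak{ek}_{U,U',i}$ to the
adversary.

• If m = init and $\mathcal{T}_{U,U',i} = \emptyset$ then $s_{U,U',i} \leftarrow \mathbb{Z}/q\mathbb{Z}$, $\mathfrak{ek}_U \leftarrow g^{s_{U,U',i}}$, $s_U \leftarrow \mathrm{Sig}_{\mathfrak{sk}_U}(\mathfrak{ek}_U)$, $r \leftarrow
(\mathfrak{ek}_U, s_U)$, $\mathcal{T}_{U,U',i} = \{\mathrm{init}, r\}$, and return r to the adversary.

• If $m = (\mathfrak{ek}_{U'}, s_{U'})$ with $\mathfrak{ek}_{U'} \in G$, $\mathcal{T}_{U,U',i} = \emptyset$ and $\mathrm{Verify}_{\mathfrak{pk}_{U'}}(s_{U'}, \mathfrak{ek}_{U'})$ = true then $s_{U,U',i} \leftarrow
\mathbb{Z}/q\mathbb{Z}$, $\mathfrak{ek}_U \leftarrow g^{s_{U,U',i}}$, $s_U \leftarrow \mathrm{Sig}_{\mathfrak{sk}_U}(\mathfrak{ek}_U)$, $r \leftarrow (\mathfrak{ek}_U, s_U)$, $\mathfrak{ek}_{U,U',i} \leftarrow \mathfrak{ek}_{U'}^{\,s_{U,U',i}}$, $\Sigma_{U,U',i} \leftarrow$ accept,
$\mathcal{T}_{U,U',i} = \{m, r\}$, and return r to the adversary.

• If $m = (\mathfrak{ek}_{U'}, s_{U'})$ with $\mathfrak{ek}_{U'} \in G$, $\mathcal{T}_{U,U',i} = \{\mathrm{init}, (\mathfrak{ek}_U, s_U)\}$ and $\mathrm{Verify}_{\mathfrak{pk}_{U'}}(s_{U'}, \mathfrak{ek}_{U'})$ = true then
$\mathfrak{ek}_{U,U',i} \leftarrow \mathfrak{ek}_{U'}^{\,s_{U,U',i}}$, $\Sigma_{U,U',i} \leftarrow$ accept, $\mathcal{T}_{U,U',i} = \{\mathrm{init}, (\mathfrak{ek}_U, s_U), m, \perp\}$, and return $\perp$ to the
adversary.

• Otherwise abort.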

Within the AKA security model we can present the following attack, which formalizes the replacement of signature attack discussed above. Our adversary performs the following steps:

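• Assume three distinct parties $U$, $U'$ and $K$.

• $\mathfrak{sk}_K \leftarrow \mathcal{O}_K(\text{corrupt})$.

• $(e_U, s_U) \leftarrow \mathcal{O}(U, U', 1, \text{init})$.

• $s_K \leftarrow \text{Sig}_{\mathfrak{sk}_K}(e_U)$.

• $(e_{U'}, s_{U'}) \leftarrow \mathcal{O}(U', K, 1, (e_U, s_K))$.

• $\mathcal{O}(U, U', 1, (e_{U'}, s_{U'}))$.

• $k' \leftarrow \mathcal{O}(U', K, 1, \text{reveal})$.

• $k \leftarrow \mathcal{O}_{\text{Test}}(U, U', 1)$.

• If $k = k'$ then return $b' = 1$.

• Return $b' = 0$.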

In this execution only party $K$ is corrupted, and in particular neither $U$ nor $U'$ has been corrupted. However, party $U$ thinks it is talking to party $U'$, whereas party $U'$ thinks it is talking to party $K$; yet in the sessions executed by $U$ and $U'$ they agree on the same ephemeral key. Hence, the above attack is valid within the AKA model and succeeds with probability one.

Blake-Wilson-Menezes Example: It turns out that the Blake-Wilson-Menezes protocol also
has  an  attack  on  it,  which  makes  use  of  the  fact  that  we  allow  the  adversary  to  register  public  keys 
of  its  own  choosing.  We  describe  the  attack  via  the  following  pseudo-code,  assuming  the  underlying 
group  is  of  prime  order  q. 

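• Assume two legitimate parties $U$ and $U'$.

• $t \leftarrow (\mathbb{Z}/q\mathbb{Z})^*$.

• $\mathfrak{pk}_V \leftarrow \mathfrak{pk}_U^{\,t}$.

• $\mathcal{O}_V(\text{register}, \mathfrak{pk}_V)$.

• $e_U \leftarrow \mathcal{O}(U, U', 1, \text{init})$.

• $e_{U'} \leftarrow \mathcal{O}(U', V, 1, e_U)$.

• $\mathcal{O}(U, U', 1, e_{U'}^{\,t})$.

• $k' \leftarrow \mathcal{O}(U', V, 1, \text{reveal})$.

• $k \leftarrow \mathcal{O}_{\text{Test}}(U, U', 1)$.

• If $k = k'$ then return $b' = 1$.

• Return $b' = 0$.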

Notice that $U$ thinks it is talking to $U'$, whereas $U'$ thinks it is talking to $V$. Let $x$ be the secret ephemeral key chosen by $U$ in the above execution, and let $y$ be the secret ephemeral key chosen by $U'$. Then the secret key for the (test) session $U$ is engaged in is equal to

\[ k \leftarrow H\big(\mathfrak{pk}_{U'}^{\,x},\ (e_{U'}^{\,t})^{\mathfrak{sk}_U}\big) = H\big(g^{\mathfrak{sk}_{U'} \cdot x},\ g^{t \cdot y \cdot \mathfrak{sk}_U}\big), \]

whilst the secret key for the (revealed) session $U'$ is engaged in is equal to

\[ k' \leftarrow H\big(e_U^{\,\mathfrak{sk}_{U'}},\ \mathfrak{pk}_V^{\,y}\big) = H\big(g^{x \cdot \mathfrak{sk}_{U'}},\ g^{t \cdot \mathfrak{sk}_U \cdot y}\big). \]

So the two keys are identical, and yet the session executed by $U'$ is not a matching session with that executed by $U$.
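
The equality of the two keys can be checked numerically. The following is a minimal sketch under stated assumptions: the toy group (p = 1019, q = 509, g = 4), the use of SHA-256 as a stand-in for H, and all variable names are our own choices for illustration, and the parameters are far too small for real use.

```python
import hashlib
import secrets

# Toy parameters (assumed): p = 2q + 1 with q prime, and g = 4 generates
# the subgroup of order q. These are illustrative only.
p, q, g = 1019, 509, 4

def H(*elements):
    """Hash a tuple of group elements to a session key (stand-in for H)."""
    return hashlib.sha256(",".join(map(str, elements)).encode()).hexdigest()

def rand_exp():
    """Random non-zero exponent modulo q."""
    return secrets.randbelow(q - 1) + 1

# Long-term keys of the honest parties U and U'.
sk_U, sk_U_prime = rand_exp(), rand_exp()
pk_U, pk_U_prime = pow(g, sk_U, p), pow(g, sk_U_prime, p)

# The adversary registers pk_V = pk_U^t; it knows no secret key for pk_V.
t = rand_exp()
pk_V = pow(pk_U, t, p)

# Ephemeral keys: U sends e_U = g^x, and U' (who believes it is talking
# to V) answers with e_U' = g^y.
x, y = rand_exp(), rand_exp()
e_U, e_U_prime = pow(g, x, p), pow(g, y, p)

# The adversary hands e_U'^t to U in place of e_U'.
m_for_U = pow(e_U_prime, t, p)

# Test session (U as initiator, partner U'): k = H(pk_U'^x, m^sk_U).
k_test = H(pow(pk_U_prime, x, p), pow(m_for_U, sk_U, p))
# Revealed session (U' as responder, partner V): k' = H(e_U^sk_U', pk_V^y).
k_revealed = H(pow(e_U, sk_U_prime, p), pow(pk_V, y, p))

keys_match = (k_test == k_revealed)
```

Both hash inputs reduce to the pair $(g^{x \cdot \mathfrak{sk}_{U'}}, g^{t \cdot y \cdot \mathfrak{sk}_U})$, so `keys_match` holds for every choice of the random exponents.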

Modified Blake-Wilson-Menezes Example: The problem with the previous protocol is that the derived key does not depend on whether the sessions involved have matching conversations. We therefore modify the Blake-Wilson-Menezes protocol in the following way, and we are then able to prove the protocol is secure. Again we assume public/private key pairs are given by $\mathfrak{pk}_A = g^a$ for Alice and $\mathfrak{pk}_B = g^b$ for Bob, and that the message flows are still defined by

\[ A \rightarrow B : \mathfrak{ek}_A = g^x, \]

\[ B \rightarrow A : \mathfrak{ek}_B = g^y. \]

The  key  difference  is  in  how  the  shared  secret  key  k  is  derived;  we  also  include  the  transcript  of  the 
protocol  within  the  key  derivation.  Alice  derives  it  via  the  equation 

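\[ k \leftarrow H\big(\mathfrak{pk}_B^{\,x}, \mathfrak{ek}_B^{\,a}, \mathfrak{pk}_A, \mathfrak{pk}_B, \mathfrak{ek}_A, \mathfrak{ek}_B\big) = H\big(g^{b \cdot x}, g^{y \cdot a}, g^a, g^b, g^x, g^y\big). \]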

While  Bob  derives  the  same  key  using  the  equation 

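\[ k \leftarrow H\big(\mathfrak{ek}_A^{\,b}, \mathfrak{pk}_A^{\,y}, \mathfrak{pk}_A, \mathfrak{pk}_B, \mathfrak{ek}_A, \mathfrak{ek}_B\big) = H\big(g^{x \cdot b}, g^{a \cdot y}, g^a, g^b, g^x, g^y\big). \]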
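
As a sanity check on the two derivations, the sketch below runs one honest exchange and then replays the public-key registration attack on the unmodified protocol with the fixed choice t = 2. The toy group, the SHA-256 stand-in for H and the variable names are assumptions made purely for illustration.

```python
import hashlib
import secrets

# Toy parameters (assumed): a subgroup of prime order q = 509 in Z_1019^*.
p, q, g = 1019, 509, 4

def H(*elements):
    """Stand-in for the hash function H applied to a sextuple."""
    return hashlib.sha256(",".join(map(str, elements)).encode()).hexdigest()

def rand_exp():
    return secrets.randbelow(q - 1) + 1

# Honest run: long-term keys (a, b) and ephemeral keys (x, y).
a, b = rand_exp(), rand_exp()
pk_A, pk_B = pow(g, a, p), pow(g, b, p)
x, y = rand_exp(), rand_exp()
ek_A, ek_B = pow(g, x, p), pow(g, y, p)

k_alice = H(pow(pk_B, x, p), pow(ek_B, a, p), pk_A, pk_B, ek_A, ek_B)
k_bob = H(pow(ek_A, b, p), pow(pk_A, y, p), pk_A, pk_B, ek_A, ek_B)

# Replay of the earlier attack with t = 2: Alice plays U, Bob plays U'.
# The adversary registers pk_V = pk_A^2 and forwards ek_B^2 to Alice.
t = 2
pk_V = pow(pk_A, t, p)
m = pow(ek_B, t, p)
k_U = H(pow(pk_B, x, p), pow(m, a, p), pk_A, pk_B, ek_A, m)
k_U_prime = H(pow(ek_A, b, p), pow(pk_V, y, p), pk_V, pk_B, ek_A, ek_B)
```

In the honest run the two sextuples coincide, so `k_alice == k_bob`. In the replayed attack the first two components still agree, but the public keys and ephemeral keys bound into the hash differ, so the two sessions no longer derive the same key.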

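In terms of “code” for our $\mathcal{O}(U, U', i, m)$ oracle we have that the protocol is defined by

• If $m = \text{reveal}$ and $\Sigma_{U,U',i} = \text{accept}$ then set $\Sigma_{U,U',i} \leftarrow \text{revealed}$ and return $k_{U,U',i}$ to the adversary.

• If $m = \text{init}$ and $T_{U,U',i} = \emptyset$ then $s_{U,U',i} \leftarrow \mathbb{Z}/q\mathbb{Z}$, $r \leftarrow g^{s_{U,U',i}}$, $T_{U,U',i} = \{\text{init}, r\}$, and return $r$ to the adversary.

• If $m \in G$ and $T_{U,U',i} = \emptyset$ then $s_{U,U',i} \leftarrow \mathbb{Z}/q\mathbb{Z}$, $r \leftarrow g^{s_{U,U',i}}$, $\Sigma_{U,U',i} \leftarrow \text{accept}$, $T_{U,U',i} = \{m, r\}$,

\[ k_{U,U',i} \leftarrow H\big(m^{\mathfrak{sk}_U}, \mathfrak{pk}_{U'}^{\,s_{U,U',i}}, \mathfrak{pk}_{U'}, \mathfrak{pk}_U, m, r\big), \]

and return $r$ to the adversary.

• If $m \in G$ and $T_{U,U',i} = \{\text{init}, r_1\}$ then $\Sigma_{U,U',i} \leftarrow \text{accept}$, $T_{U,U',i} = \{\text{init}, r_1, m, \perp\}$,

\[ k_{U,U',i} \leftarrow H\big(\mathfrak{pk}_{U'}^{\,s_{U,U',i}}, m^{\mathfrak{sk}_U}, \mathfrak{pk}_U, \mathfrak{pk}_{U'}, r_1, m\big), \]

and return $\perp$ to the adversary.

• Otherwise abort.

We can now prove that this protocol is secure assuming the Gap Diffie-Hellman problem is hard.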

Theorem 18.2. Let $A$ denote an adversary against the AKA security of the modified Blake-Wilson-Menezes protocol, operating with $n_P$ legitimate users each of whom may have up to $n_S$ sessions, where $H$ is modelled as a random oracle. Then there is an adversary $B$ against the Gap Diffie-Hellman problem in the group $G$ such that

\[ \mathrm{Adv}^{\mathrm{AKA}}(A; n_P, n_S) \le n_P^2 \cdot n_S \cdot \mathrm{Adv}^{\mathrm{Gap\text{-}DHP}}_{G}(B). \]

From the theorem it is easy to deduce that the modified Blake-Wilson-Menezes protocol is secure, assuming the Gap Diffie-Hellman problem is hard.

Proof. We assume we are given an algorithm $A$ against the AKA security of the protocol and we want to create the algorithm $B$ against the Gap Diffie-Hellman problem. Algorithm $B$ will maintain a hash list $H$-List consisting of elements in $G^6 \times \{0,1\}^k$, representing the calls to $H$ on sextuples of elements in $G$ and the corresponding outputs.

Algorithm $B$ has as input $g^a$ and $g^y$ and is asked to compute $g^{a \cdot y}$, given access to an oracle $\mathcal{O}_{\mathrm{DDH}}$ which checks tuples for being Diffie-Hellman tuples. Algorithm $B$ sets up $n_P$ public/private keys for algorithm $A$ as in the legitimate game, but picks one user $U^* \in \mathcal{U}$ and sets its public key to be $g^a$. Algorithm $B$ also picks a session identifier $i^* \in \{1, \ldots, n_S\}$, and another identity $W^* \in \mathcal{U}$. We let $b$ denote the private key of entity $W^*$; notice that this is known to algorithm $B$, and hence is marked in blue.

Algorithm  B  then  calls  algorithm  A  and  responds  to  its  oracle  queries  as  in  the  real  game  except 
for  the  following  cases: 

• If $\mathcal{O}_{U^*}(\text{corrupt})$ is called then abort.

• If $\mathcal{O}(U^*, W^*, i^*, \text{reveal})$ is called then abort.

• If $\mathcal{O}(W^*, U^*, i^*, m)$ is called then respond with $g^y$.

• If $\mathcal{O}_{\text{Test}}(U, W, i)$ is called with $U \ne U^*$, $W \ne W^*$ or $i \ne i^*$ then abort. Otherwise respond with a random value in $\{0,1\}^k$.

The only problem is in maintaining consistency between any reveal queries made by $A$ and any calls made by $A$ to the hash function $H$. In other words we need to maintain consistency of the $H$-List in both the reveal and hash function queries made by $A$. However, such consistency can be maintained in the same manner as in Theorem 16.9 using the $\mathcal{O}_{\mathrm{DDH}}$ oracle to which algorithm $B$ has access.

Note that the probability that $B$ does not abort because it guessed $(U^*, W^*, i^*)$ incorrectly is $1/(n_P^2 \cdot n_S)$, which accounts for the factor in the theorem. If the algorithm $A$ is able to win the AKA game with non-negligible probability then $A$ must make a hash query on the critical sextuple. As we are using the modified Blake-Wilson-Menezes protocol, no reveal query made by $A$ will result in the same input to the hash function as the critical query, since no other reveal query will have the same transcript. Thus to have any advantage $A$ must make the critical call to the hash function.

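We now examine what this sextuple will be; we let $m$ denote the ephemeral public key passed to $W^*$ in the test session and let $x = \mathrm{dlog}_g(m)$; note that $B$ does not know $x$ so we mark this value in red. If $U^*$ was an initiator oracle then the sextuple is

\[ \big(\mathfrak{pk}_{W^*}^{\,x}, \mathfrak{ek}_{W^*}^{\,a}, \mathfrak{pk}_{U^*}, \mathfrak{pk}_{W^*}, m, \mathfrak{ek}_{W^*}\big) = \big(\mathfrak{pk}_{W^*}^{\,x}, g^{y \cdot a}, g^a, \mathfrak{pk}_{W^*}, m, g^y\big). \]

Thus in this case the algorithm $B$ simply looks for the critical query on the $H$-List using its $\mathcal{O}_{\mathrm{DDH}}$ oracle and outputs the second value of the tuple as the answer to the Gap Diffie-Hellman problem. If $U^*$ was the responder oracle, then a similar method can be applied using the first value of the tuple, as the critical sextuple is given by

\[ \big(\mathfrak{ek}_{W^*}^{\,a}, \mathfrak{pk}_{W^*}^{\,x}, \mathfrak{pk}_{W^*}, \mathfrak{pk}_{U^*}, \mathfrak{ek}_{W^*}, \mathfrak{ek}_{U^*}\big) = \big(g^{y \cdot a}, \mathfrak{pk}_{W^*}^{\,x}, \mathfrak{pk}_{W^*}, g^a, g^y, m\big). \]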

□ 


A  similar  proof  can  be  made  for  the  MQV  protocol,  which  due  to  space  we  do  not  cover  here. 


Chapter  Summary 


•  Digital  certificates  allow  us  to  bind  a  public  key  to  some  other  information,  such  as  an 
identity.  This  binding  of  key  with  identity  allows  us  to  solve  the  problem  of  how  to 
distribute  authentic  public  keys. 

• Various PKI systems have been proposed, all of which have problems and benefits associated with them.

• Implicit certificates aim to reduce the bandwidth requirements of standard certificates; however, they come with a number of drawbacks.

(Footnote to the proof of Theorem 18.2: the details are tedious and are left to the reader.)

• A number of key agreement protocols exist based on a trusted third party and symmetric encryption algorithms. These protocols require long-term keys to have been already established with the TTP; they may also require some form of clock synchronization.

• Diffie-Hellman key exchange can be used by two parties to agree on a secret key over an insecure channel. However, Diffie-Hellman is susceptible to a man-in-the-middle attack and so requires some form of authentication of the communicating parties.

• To obtain authentication in the Diffie-Hellman protocol, various different options exist, of which we discussed the STS protocol, the Blake-Wilson-Menezes protocol and the MQV protocol.

•  Various  formal  logics  exist  to  analyse  such  protocols.  The  most  influential  of  these  has  been 
the  BAN  logic.  These  logics  help  to  identify  explicit  assumptions  and  problems  associated 
with  each  protocol;  they  can  identify  attacks  but  usually  do  not  provide  security  proofs. 

• For a computational proof of security one needs to define an elaborate security model. Such models capture a multitude of attacks, with the result that even the simplest protocols are deemed insecure.


Further  Reading 

The  paper  by  Burrows,  Abadi  and  Needham  is  a  very  readable  introduction  to  the  BAN  logic 
and  a  number  of  key  agreement  protocols  based  on  static  symmetric  keys;  much  of  our  treatment 
of  this  subject  is  based  on  this  paper.  Our  treatment  of  security  models  for  key  agreement  is 
based  on  work  started  in  the  paper  by  Bellare  and  Rogaway.  Our  proof  of  security  of  the  modified 
Blake-Wilson-Menezes  protocol  is  based  on  the  paper  by  Kudla  and  Paterson. 

M.  Bellare  and  P.  Rogaway.  Entity  authentication  and  key  distribution.  Advances  in  Cryptology 
-  Crypto  1993,  LNCS  773,  232-249,  Springer,  1994. 

M. Burrows, M. Abadi and R. Needham. A Logic of Authentication. Digital Equipment Corporation, SRC Research Report 39, 1990.

C.  Kudla  and  K.G.  Paterson.  Modular  security  proofs  for  key  agreement  protocols.  Advances  in 
Cryptology  -  Asiacrypt  2005,  LNCS  3788,  549-565,  Springer,  2005. 


Part  4 


Advanced  Protocols 


Encryption,  hash  functions,  MACs  and  signatures  are  only  the  most  basic  of  cryptographic 
constructions  and  protocols.  We  usually  think  of  them  as  being  carried  out  between  a  sender  and 
a  receiver  who  have  the  same  security  goals.  For  example,  in  encryption  both  the  sender  and  the 
receiver  probably  wish  to  keep  the  message  secret  from  an  adversary.  In  other  words  the  adversary 
is  assumed  to  be  someone  else. 

In  this  section  we  shall  detail  a  number  of  more  advanced  protocols.  These  are  mainly  protocols 
between  two  or  more  people  in  which  the  security  goals  of  the  different  parties  could  be  conflicting, 
or  different.  For  example  in  an  electronic  election  voters  want  their  votes  to  be  secret,  yet  all  parties 
want  to  know  that  all  votes  have  been  counted,  and  all  parties  want  to  ensure  against  a  bad  voter 
casting  too  many  votes  or  trying  to  work  out  how  someone  else  has  voted.  Hence,  the  adversaries 
are  also  the  parties  in  the  protocol,  not  necessarily  external  entities. 

First  we  focus  on  secret  sharing  schemes,  which  allow  a  party  to  share  a  secret  amongst  a 
number  of  partners.  This  has  important  applications  in  splitting  of  secrets  into  parts  which  can 
then  be  used  in  distributed  protocols.  Then  we  turn  to  commitment  schemes  and  oblivious  transfer. 
These  are  two  types  of  basic  protocols  between  two  parties,  in  which  the  parties  are  assumed  to  be 
mutually  untrusting,  i.e.  the  adversary  is  the  person  with  whom  you  are  performing  the  protocol. 
We then turn to the concept of zero-knowledge proofs. In this chapter we also examine a simple
electronic  voting  scheme.  Finally  we  look  at  the  subject  of  secure  multi-party  computation,  which 
provides  an  interesting  application  of  many  of  our  preceding  algorithms. 


CHAPTER  19 


Secret  Sharing  Schemes 


Chapter  Goals 

•  To  introduce  the  notion  of  secret  sharing  schemes. 

•  To  give  some  simple  examples  of  general  access  structures. 

•  To  present  Shamir’s  scheme,  including  how  to  recover  the  secret  in  the  presence  of  active 
adversaries. 

•  To  show  the  link  between  Shamir’s  secret  sharing  and  Reed-Solomon  codes. 

•  As  an  application  we  show  how  secret  sharing  can  provide  more  security  for  a  certificate 
authority,  via  distributed  RSA  signature  generation. 

19.1.  Access  Structures 

Suppose you have a secret $s$ which you wish to share amongst a set $\mathcal{P}$ of $n$ parties. You would like certain subsets of the $n$ parties to recover the secret but not others. The classic scenario might be that $s$ is a nuclear launch code and you have four people, the president, the vice-president, the secretary of state and a general in a missile silo. You do not want the general to be able to launch the missile without the president agreeing, but to maintain deterrence you would like, in the case that the president has been eliminated, that the vice-president, the secretary of state and the general can agree to launch the missile. If we label the four parties as P, V, S and G, for president, vice-president, secretary of state and general, then we would like the following sets of people to be able to launch the missile

{P, G} and {V, S, G},

but no smaller sets. It is this problem which secret sharing is designed to deal with; however, the applications are more widespread than might at first appear.

To each party we distribute some information called a share. For a party $A$ we will let $s_A$ denote the secret share which they hold. In the example above there are four such shares: $s_P$, $s_V$, $s_S$ and $s_G$. Then if the required parties come together we would like an algorithm which combines their relevant shares into the secret $s$. But if the wrong set of parties come together they should learn no information about $s$.

Before  introducing  schemes  to  perform  secret  sharing  we  first  need  to  introduce  the  notion  of 
an  access  structure.  Any  subset  of  parties  who  can  recover  the  secret  will  be  called  a  qualifying  set , 
whilst  the  set  of  all  qualifying  sets  will  be  called  an  access  structure.  So  in  the  example  above  we 
have  that  the  two  sets 

{P, G} and {V, S, G}

are qualifying sets. However, clearly any set containing such a qualifying set is also a qualifying set. Thus

{P, G, V}, {P, G, S} and {P, V, G, S}

are  also  qualifying  sets.  Hence,  there  are  five  sets  in  the  access  structure.  For  any  set  in  the  access 
structure,  if  we  have  the  set  of  shares  for  that  set  we  would  like  to  be  able  to  reconstruct  the  secret. 

Definition 19.1. A monotone structure on a set $\mathcal{P}$ is a collection $\Gamma$ of subsets of $\mathcal{P}$ such that

• $\mathcal{P} \in \Gamma$.

• If there is a set $A \in \Gamma$ and a set $B$ such that $A \subseteq B \subseteq \mathcal{P}$ then $B \in \Gamma$.

Thus in the above example the access structure is monotone. This is a property which will hold for all access structures of all secret sharing schemes. For a monotone structure we note that the sets in $\Gamma$ come in chains, $A \subset B \subset C \subset \mathcal{P}$. We shall call the sets which form the start of a chain the minimal qualifying sets. The set of all such minimal qualifying sets for an access structure $\Gamma$ we shall denote by $m(\Gamma)$. We can now give a very informal definition of what we mean by a secret sharing scheme:

Definition 19.2. A secret sharing scheme for a monotone access structure $\Gamma$ over a set of parties $\mathcal{P}$ with respect to a space of secrets $S$ is a pair of algorithms called Share and Recombine with the following properties:

• Share$(s, \Gamma)$ takes a secret $s \in S$ and a monotone access structure and determines a value $s_A$ for every $A \in \mathcal{P}$. The value $s_A$ is called $A$’s share of the secret.

• Recombine$(H_{\mathcal{O}})$ takes a set $H_{\mathcal{O}}$ of shares for some subset $\mathcal{O}$ of $\mathcal{P}$, i.e.

\[ H_{\mathcal{O}} = \{ s_O : O \in \mathcal{O} \}. \]

If $\mathcal{O} \in \Gamma$ then this should return the secret $s$, otherwise it should return nothing.

A secret sharing scheme is considered to be secure if no infinitely powerful adversary can learn anything about the underlying secret without having access to the shares of a qualifying set. Actually such schemes are said to be information-theoretically secure, but since most secret sharing schemes in the literature are information-theoretically secure we shall just call such schemes secure.

In this chapter we will consider two running examples of monotone access structures, so as to illustrate the schemes. Both will be on sets of four elements. The first is from the example above where we have $\mathcal{P} = \{P, V, S, G\}$ and

\[ \Gamma = \{ \{P, G\}, \{V, S, G\}, \{P, G, V\}, \{P, G, S\}, \{P, V, G, S\} \}. \]

The set of minimal qualifying sets is given by

\[ m(\Gamma) = \{ \{P, G\}, \{V, S, G\} \}. \]

The second example we shall define over the set of parties $\mathcal{P} = \{A, B, C, D\}$, with access structure

\[ \Gamma = \{ \{A, B\}, \{A, C\}, \{A, D\}, \{B, C\}, \{B, D\}, \{C, D\}, \{A, B, C\}, \{A, B, D\}, \{A, C, D\}, \{B, C, D\}, \{A, B, C, D\} \}. \]

The set of minimal qualifying sets is given by

\[ m(\Gamma) = \{ \{A, B\}, \{A, C\}, \{A, D\}, \{B, C\}, \{B, D\}, \{C, D\} \}. \]

This  last  access  structure  is  interesting  because  it  represents  a  common  form  of  threshold  access 
structure.  Notice  that,  in  this  access  structure,  we  require  that  any  two  out  of  the  four  parties 
should  be  able  to  recover  the  secret.  We  call  such  a  scheme  a  2-out-of-4  threshold  access  structure. 
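
The passage from minimal qualifying sets to the full access structure can be made concrete. The following is a small sketch; the function names `monotone_closure` and `minimal_sets` are our own, and the brute-force enumeration of subsets is only sensible for a handful of parties.

```python
from itertools import combinations

def monotone_closure(parties, minimal_qualifying):
    """All subsets of `parties` containing at least one minimal qualifying set."""
    parties = sorted(parties)
    minimal = [frozenset(m) for m in minimal_qualifying]
    closure = set()
    for size in range(len(parties) + 1):
        for subset in combinations(parties, size):
            candidate = frozenset(subset)
            if any(m <= candidate for m in minimal):
                closure.add(candidate)
    return closure

def minimal_sets(access_structure):
    """The sets of an access structure with no proper subset in the structure."""
    return {a for a in access_structure
            if not any(b < a for b in access_structure)}

# First running example: president, vice-president, secretary, general.
first = monotone_closure("PVSG", [{"P", "G"}, {"V", "S", "G"}])

# Second running example: the 2-out-of-4 threshold structure.
pairs = [set(pair) for pair in combinations("ABCD", 2)]
threshold = monotone_closure("ABCD", pairs)
```

Here `first` contains the five qualifying sets listed above, whilst `threshold` contains every subset of size at least two, and applying `minimal_sets` to either recovers the corresponding $m(\Gamma)$.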

One way of looking at such access structures is via a Boolean formula. Consider the set $m(\Gamma)$ of minimal qualifying sets and define the formula

\[ \bigvee_{\mathcal{O} \in m(\Gamma)} \Big( \bigwedge_{O \in \mathcal{O}} O \Big). \]

So in our first example above the formula becomes

\[ (P \wedge G) \vee (V \wedge S \wedge G). \]

Reading this formula out, with $\wedge$ being “and”, and $\vee$ being “or”, we see that we can reconstruct the secret if we have access to the secret shares of


(P  and  G)  or  (V  and  S  and  G). 

Notice  how  the  formula  is  in  disjunctive  normal  form  (DNF),  i.e.  an  “or”  of  a  set  of  “and”  clauses. 
We  shall  use  this  representation  below  to  construct  a  secret  sharing  scheme  for  any  access  structure. 


19.2.  General  Secret  Sharing 

We  now  turn  to  two  methods  for  constructing  secret  sharing  schemes  for  arbitrary  monotone  access 
structures.  They  are  highly  inefficient  for  all  but  the  simplest  access  structures  but  they  do  show 
that  one  can  cope  with  an  arbitrary  access  structure.  We  assume  that  the  space  of  secrets  S  is 
essentially  the  set  of  bit  strings  of  length  n  bits.  In  both  examples  we  let  s  G  S  denote  the  secret 
which  we  are  trying  to  share. 


19.2.1. Ito-Nishizeki-Saito Secret Sharing: Our first secret sharing scheme makes use of the DNF Boolean formula we presented above. In some sense every “or” gets converted into a concatenation operation and every “and” gets converted into a $\oplus$ operation. This can at first sight seem slightly counterintuitive, since usually we associate “and” with multiplication and “or” with addition.

The sharing algorithm works as follows. For every minimal qualifying set $\mathcal{O} \in m(\Gamma)$, we generate shares $s_i \in S$, for $1 \le i \le l$, at random, where $l = |\mathcal{O}|$, such that $s_1 \oplus \cdots \oplus s_l = s$. Then a party $A$ is given a share $s_i$ if $A$ occurs at position $i$ in the set $\mathcal{O}$.


Example:  Recall  we  have  the  formula 

(P  and  G)  or  (V  and  S  and  G). 
We generate five elements $s_i$ from $S$ such that

\[ s = s_1 \oplus s_2 = s_3 \oplus s_4 \oplus s_5. \]

The four shares are then defined to be:

\[ s_P \leftarrow s_1, \quad s_V \leftarrow s_3, \quad s_S \leftarrow s_4, \quad s_G \leftarrow s_2 \| s_5. \]


You  should  check  that,  given  this  sharing,  any  qualifying  set  can  recover  the  secret,  and  only  the 
qualifying  sets  can  recover  the  secret.  Notice  that  party  G  needs  to  hold  two  times  more  data  than 
the  size  of  the  secret.  Thus  this  scheme  in  this  case  is  not  efficient.  Ideally  we  would  like  the  parties 
to  only  hold  the  equivalent  of  n  bits  of  information  each  so  as  to  recover  a  secret  of  n  bits. 
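
The sharing and recombination steps for this example can be sketched as follows. This is an illustration only: secrets are modelled as byte strings, the names `ins_share` and `ins_recombine` are our own, and each party's share is stored as a dictionary indexed by clause rather than as a literal concatenation.

```python
import secrets

def xor_bytes(a, b):
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def ins_share(secret, clauses):
    """Share `secret`; returns {party: {clause_index: piece}}."""
    shares = {}
    for idx, clause in enumerate(clauses):
        # Random pieces for all but the last member of the clause ...
        pieces = [secrets.token_bytes(len(secret)) for _ in clause[:-1]]
        # ... and a final piece chosen so that all pieces XOR to the secret.
        last = secret
        for piece in pieces:
            last = xor_bytes(last, piece)
        pieces.append(last)
        for party, piece in zip(clause, pieces):
            shares.setdefault(party, {})[idx] = piece
    return shares

def ins_recombine(shares, present, clauses):
    """Return the secret if `present` contains a whole clause, else None."""
    present = set(present)
    for idx, clause in enumerate(clauses):
        if set(clause) <= present:
            result = bytes(len(shares[clause[0]][idx]))  # all-zero string
            for party in clause:
                result = xor_bytes(result, shares[party][idx])
            return result
    return None

# The formula (P and G) or (V and S and G).
clauses = [("P", "G"), ("V", "S", "G")]
secret = b"launch-code"
shares = ins_share(secret, clauses)
```

Calling `ins_recombine(shares, {"P", "G"}, clauses)` returns the secret, whereas a non-qualifying set such as `{"P", "V", "S"}` yields `None`. Party G ends up holding two pieces, matching the observation above.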


Example:  Now  our  formula  is  given  by 

(A  and  B)  or  (A  and  C)  or  (A  and  D)  or  (B  and  C)  or  (B  and  D)  or  (C  and  D). 

We now generate twelve elements $s_i$ from $S$, one for each of the distinct terms in the formula above, such that

\[ s = s_1 \oplus s_2 = s_3 \oplus s_4 = s_5 \oplus s_6 = s_7 \oplus s_8 = s_9 \oplus s_{10} = s_{11} \oplus s_{12}. \]

The four shares are then defined to be:

\[ s_A \leftarrow s_1 \| s_3 \| s_5, \]
\[ s_B \leftarrow s_2 \| s_7 \| s_9, \]
\[ s_C \leftarrow s_4 \| s_8 \| s_{11}, \]
\[ s_D \leftarrow s_6 \| s_{10} \| s_{12}. \]

You  should  again  check  that,  given  this  sharing,  any  qualifying  set  and  only  the  qualifying  sets  can 
recover  the  secret.  We  see  that  in  this  case  every  share  contains  three  times  more  information  than 
the  underlying  secret. 


19.2.2. Replicated Secret Sharing: The above is not the only scheme for general access structures. Here we present another one, called the replicated secret sharing scheme. In this scheme we first create the set of all maximal non-qualifying sets; these are the sets of parties such that if you add a single new party to each set you will obtain a qualifying set. If we label these sets $A_1, \ldots, A_t$, we then form their set-theoretic complements, i.e. $B_i = \mathcal{P} \setminus A_i$. A set of secret shares $s_i$ is then generated, one for each set $B_i$, so that

\[ s = s_1 \oplus \cdots \oplus s_t. \]

A party is given the share $s_i$ if it is contained in the set $B_i$.


Example:  The  sets  of  maximal  non-qualifying  sets  for  our  first  example  are 

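\[ A_1 = \{P, V, S\}, \quad A_2 = \{V, G\} \quad \text{and} \quad A_3 = \{S, G\}. \]

Forming their complements we obtain the sets

\[ B_1 = \{G\}, \quad B_2 = \{P, S\} \quad \text{and} \quad B_3 = \{P, V\}. \]

We generate three shares $s_1$, $s_2$ and $s_3$ such that $s = s_1 \oplus s_2 \oplus s_3$ and then define the shares as

\[ s_P \leftarrow s_2 \| s_3, \quad s_V \leftarrow s_3, \quad s_S \leftarrow s_2, \quad s_G \leftarrow s_1. \]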


Again  we  can  check  that  only  the  qualifying  sets  can  recover  the  secret. 


Example: For the 2-out-of-4 threshold access structure we obtain the following maximal non-qualifying sets

\[ A_1 = \{A\}, \quad A_2 = \{B\}, \quad A_3 = \{C\} \quad \text{and} \quad A_4 = \{D\}. \]

On forming their complements we obtain

\[ B_1 = \{B, C, D\}, \quad B_2 = \{A, C, D\}, \quad B_3 = \{A, B, D\} \quad \text{and} \quad B_4 = \{A, B, C\}. \]

We form the four shares such that $s = s_1 \oplus s_2 \oplus s_3 \oplus s_4$ and set

\[ s_A \leftarrow s_2 \| s_3 \| s_4, \]
\[ s_B \leftarrow s_1 \| s_3 \| s_4, \]
\[ s_C \leftarrow s_1 \| s_2 \| s_4, \]
\[ s_D \leftarrow s_1 \| s_2 \| s_3. \]

Whilst the above two constructions provide a mechanism to construct a secret sharing scheme for any monotone access structure, they appear to be very inefficient. For the threshold access structure they are particularly bad, and become worse as the number of parties increases. In the rest of this chapter we will examine a very efficient mechanism for threshold secret sharing due to Shamir, called Shamir secret sharing. This secret sharing scheme is itself based on the ideas behind certain error-correcting codes, called Reed-Solomon codes. So we will first have a little digression into coding theory.


19.3.  Reed-Solomon  Codes 


An  error-correcting  code  is  a  mechanism  to  transmit  data  from  A  to  B  such  that  any  errors  which 
occur  during  transmission,  for  example  due  to  noise,  can  be  corrected.  They  are  found  in  many 
areas  of  electronics:  they  are  the  thing  which  makes  your  CD/DVD  resistant  to  minor  scratches; 
they  make  sure  that  RAM  chips  preserve  your  data  correctly;  they  are  used  for  communication 
between  Earth  and  satellites  or  deep  space  probes. 

A  simpler  problem  is  that  of  error  detection.  Here  one  is  only  interested  in  whether  the  data 
have  been  altered  or  not.  A  particularly  important  distinction  to  make  between  the  area  of  coding 
theory  and  cryptography  is  that  in  coding  theory  one  can  select  simpler  mechanisms  to  detect  errors. 
This  is  because  in  coding  theory  the  assumption  is  that  the  errors  are  introduced  by  random  noise, 
whereas  in  cryptography  any  errors  are  thought  to  be  actively  inserted  by  an  adversary.  Thus  in 
coding  theory  error  detection  mechanisms  can  be  very  simple,  whereas  in  cryptography  we  have  to 
resort  to  complex  mechanisms  such  as  MACs  and  digital  signatures. 

Error  correction  on  the  other  hand  is  not  only  interested  in  detecting  errors,  it  also  wants  to 
correct  those  errors.  Clearly  one  cannot  correct  all  errors,  but  it  would  be  nice  to  correct  a  certain 
number. A classic way of forming error-correcting codes is via Reed-Solomon codes. In coding theory such codes are usually presented over a finite field of characteristic two. However, we are interested in the general case and so we will be using a code over $\mathbb{F}_q$, for a prime power $q$. Each code-word is a vector over $\mathbb{F}_q$, with the vector length being called the length of the code-word. The set of all valid code-words forms the code; there is a mapping from the set of valid code-words to the data one wants to transmit. The invalid code-words, i.e. vectors which are not in the code, are used to perform error detection and correction.

To define a Reed-Solomon code we also require two integer parameters, $n$ and $t$. The value $n$ defines the length of each code-word, whilst the number $t$ is related to the number of errors we can correct; indeed we will be able to correct $(n - t - 1)/2$ errors. We also define a set $X \subset \mathbb{F}_q$ of size $n$. If the characteristic of $\mathbb{F}_q$ is larger than $n$ then we can select $X = \{1, 2, \ldots, n\}$, although any subset of $\mathbb{F}_q$ of size $n$ will do. For our application to Shamir secret sharing later on we will assume that $0 \notin X$.

Consider the set of polynomials of degree less than or equal to $t$ over the field $\mathbb{F}_q$,

\[ P = \{ f_0 + f_1 \cdot X + \cdots + f_t \cdot X^t : f_i \in \mathbb{F}_q \}. \]

The size of the set $P$ is the number of code-words in our code, i.e. the number of different data items which we can transmit in any given block, and hence this number is equal to $q^{t+1}$. To transmit some data, in a set of size $|P|$, we first encode it as an element of $P$ and then we translate it into a code-word. To create the actual code-word we evaluate the polynomial at all elements in $X$. Hence, the set of actual code-words is given by

\[ C = \{ (f(x_1), \ldots, f(x_n)) : f \in P,\ x_i \in X \}. \]

The size (in bits) of a code-word is then $n \cdot \log_2 q$. So we require $n \cdot \log_2 q$ bits to represent $(t+1) \cdot \log_2 q$ bits of information.

Example: As an example consider the Reed-Solomon code with parameters $q = 101$, $n = 7$, $t = 2$ and $X = \{1, 2, 3, 4, 5, 6, 7\}$. Suppose our “data”, which we represent by an element of $P$, is given by the polynomial

\[ f = 20 + 57 \cdot X + 68 \cdot X^2. \]

To transmit this data we compute $f(i) \pmod q$ for $i = 1, \ldots, 7$, to obtain the code-word

\[ c = (44, 2, 96, 23, 86, 83, 14). \]

This  code-word  can  now  be  transmitted  or  stored. 
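
The encoding step of this example is easily reproduced. The sketch below uses our own helper names `poly_eval` and `rs_encode`, represents a polynomial by its list of coefficients (lowest degree first), and assumes $q$ is prime so that arithmetic in $\mathbb{F}_q$ is just arithmetic modulo $q$.

```python
def poly_eval(coeffs, x, q):
    """Evaluate f_0 + f_1*x + ... + f_t*x^t modulo q by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % q
    return result

def rs_encode(coeffs, points, q):
    """The code-word is the vector of values of the polynomial on the set X."""
    return [poly_eval(coeffs, x, q) for x in points]

q = 101
points = list(range(1, 8))      # X = {1, ..., 7}
data = [20, 57, 68]             # f = 20 + 57*X + 68*X^2
codeword = rs_encode(data, points, q)
# codeword == [44, 2, 96, 23, 86, 83, 14]
```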

19.3.1. Data Recovery: At some point the code-word will need to be converted back into the data. In other words we have to recover the polynomial in $P$ from the set of points at which it was evaluated, i.e. the vector of values in $C$. We will first deal with the simple case and assume that no errors have occurred. The receiver is given the data $c = (c_1, \ldots, c_n)$ but has no idea as to the underlying polynomial $f$. Thus from the receiver’s perspective he wishes to find the $f_j$ such that

\[ f = \sum_{j=0}^{t} f_j \cdot X^j. \]

It is well known, from high school, that a polynomial of degree at most $t$ is determined completely by its values at $t + 1$ points. So as long as $t < n$ we can recover $f$ when no errors occur; the question is how?

First note that the receiver can generate n linear equations via

$c_i = f(x_i)$ for $x_i \in X$.

In other words he has the system of equations:

$c_1 = f_0 + f_1 \cdot x_1 + \cdots + f_t \cdot x_1^t,$
$\vdots$
$c_n = f_0 + f_1 \cdot x_n + \cdots + f_t \cdot x_n^t.$

So by solving this system of equations over the field $\mathbb{F}_q$ we can recover the polynomial and hence
the data.
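
As a rough illustration of the linear-system approach, the sketch below solves the first $t + 1$ equations for the running example by Gauss-Jordan elimination modulo $q$. The helper `solve_mod` is a hypothetical name, and using only $t + 1$ of the $n$ equations is an assumption that is valid only in the error-free case.

```python
def solve_mod(A, b, q):
    """Solve the square system A*z = b over F_q (q prime) by Gauss-Jordan elimination."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] % q != 0)
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], -1, q)  # modular inverse (needs Python 3.8+)
        M[col] = [(v * inv) % q for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] % q != 0:
                factor = M[r][col]
                M[r] = [(vr - factor * vc) % q for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


q, t = 101, 2
xs = [1, 2, 3, 4, 5, 6, 7]
c = [44, 2, 96, 23, 86, 83, 14]
# With no errors, t + 1 equations already determine the t + 1 coefficients.
A = [[pow(x, j, q) for j in range(t + 1)] for x in xs[: t + 1]]
coeffs = solve_mod(A, c[: t + 1], q)
print(coeffs)  # [20, 57, 68]
```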

Actually the polynomial $f$ can be recovered without solving the linear system, via the use of
Lagrange interpolation. Suppose we first compute the polynomials

$\delta_i(X) \leftarrow \prod_{x_j \in X,\ j \neq i} \frac{X - x_j}{x_i - x_j}, \quad 1 \le i \le n.$

Note that we have the following properties, for all $i$,

• $\delta_i(x_i) = 1$.

• $\delta_i(x_j) = 0$, if $i \neq j$.

• $\deg \delta_i(X) = n - 1$.




Lagrange interpolation takes the values $c_i$ and computes

$f(X) \leftarrow \sum_{i=1}^{n} c_i \cdot \delta_i(X).$

The three properties above of the polynomials $\delta_i(X)$ translate into the following facts about $f(X)$:

• $f(x_i) = c_i$ for all $i$.

• $\deg f(X) \le n - 1$.

Hence, Lagrange interpolation finds the unique polynomial which interpolates the n elements in
the code-word.
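
A minimal Python sketch of Lagrange interpolation over $\mathbb{F}_q$, applied to the error-free code-word of the running example, is given below. The function names and the coefficient-list representation (lowest degree first) are assumptions made for illustration.

```python
def poly_mul(a, b, q):
    """Multiply two coefficient lists (lowest degree first) modulo q."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % q
    return out


def lagrange_interpolate(xs, ys, q):
    """Coefficients of the unique polynomial of degree < len(xs) through (xs[i], ys[i])."""
    n = len(xs)
    result = [0] * n
    for i in range(n):
        # Build delta_i(X) = prod_{j != i} (X - x_j) / (x_i - x_j).
        num, denom = [1], 1
        for j in range(n):
            if j != i:
                num = poly_mul(num, [(-xs[j]) % q, 1], q)
                denom = (denom * (xs[i] - xs[j])) % q
        scale = (ys[i] * pow(denom, -1, q)) % q  # c_i / denominator
        for k, coeff in enumerate(num):
            result[k] = (result[k] + scale * coeff) % q
    return result


q = 101
xs = [1, 2, 3, 4, 5, 6, 7]
c = [44, 2, 96, 23, 86, 83, 14]
coeffs = lagrange_interpolate(xs, c, q)
print(coeffs)  # [20, 57, 68, 0, 0, 0, 0]
```

The trailing zeros show that, although interpolation through seven points could produce a polynomial of degree six, a valid code-word yields one of degree at most $t = 2$.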


19.3.2. Error Detection: We see that by using Lagrange interpolation on the code-word we will
recover a polynomial of degree t when there are no errors, but when there are errors in the received
code-word we are unlikely to obtain a valid polynomial, i.e. an element of P. Hence, we instantly
have an error-detection algorithm: we apply the method above to recover the polynomial, assuming
it is a valid code-word, then if the resulting polynomial has degree greater than t we can conclude
that an error has occurred.

Returning  to  our  example  parameters  above,  suppose  the  following  code-word  was  received 

c  =  (44,2,25,23,86,83,14). 


In other words it is equal to the sent code-word except in the third position where a 96 has been
replaced by a 25. We compute once and for all the polynomials, modulo $q = 101$,

$\delta_1(X) = 70 \cdot X^6 + 29 \cdot X^5 + 46 \cdot X^4 + 4 \cdot X^3 + 43 \cdot X^2 + 4 \cdot X + 7,$
$\delta_2(X) = 85 \cdot X^6 + 12 \cdot X^5 + 23 \cdot X^4 + 96 \cdot X^3 + 59 \cdot X^2 + 49 \cdot X + 80,$
$\delta_3(X) = 40 \cdot X^6 + 10 \cdot X^5 + 83 \cdot X^4 + 23 \cdot X^3 + 48 \cdot X^2 + 64 \cdot X + 35,$
$\delta_4(X) = 14 \cdot X^6 + 68 \cdot X^5 + 33 \cdot X^4 + 63 \cdot X^3 + 78 \cdot X^2 + 82 \cdot X + 66,$
$\delta_5(X) = 40 \cdot X^6 + 90 \cdot X^5 + 99 \cdot X^4 + 67 \cdot X^3 + 11 \cdot X^2 + 76 \cdot X + 21,$
$\delta_6(X) = 85 \cdot X^6 + 49 \cdot X^5 + 91 \cdot X^4 + 91 \cdot X^3 + 9 \cdot X^2 + 86 \cdot X + 94,$
$\delta_7(X) = 70 \cdot X^6 + 45 \cdot X^5 + 29 \cdot X^4 + 60 \cdot X^3 + 55 \cdot X^2 + 43 \cdot X + 1.$


The receiver now tries to recover the sent polynomial given the data he has received. He obtains

$f(X) \leftarrow 44 \cdot \delta_1(X) + \cdots + 14 \cdot \delta_7(X)$
$= 60 + 58 \cdot X + 94 \cdot X^2 + 84 \cdot X^3 + 66 \cdot X^4 + 98 \cdot X^5 + 89 \cdot X^6.$

But this is a polynomial of degree six and not of degree $t = 2$. Hence, the receiver knows that there
is at least one error in the code-word that he has received. He just does not know which position
is in error, nor what its actual value should be.
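
The detection procedure can be sketched in Python as follows: interpolate through all received values and compare the degree of the result with $t$. The helper names `interpolate`, `degree` and `has_error` are illustrative only.

```python
def interpolate(xs, ys, q):
    """Lagrange interpolation over F_q; returns coefficients, lowest degree first."""
    n = len(xs)
    result = [0] * n
    for i in range(n):
        num, denom = [1], 1
        for j in range(n):
            if j == i:
                continue
            # Multiply num by (X - x_j): shift by one degree, subtract x_j * num.
            num = [(a - xs[j] * b) % q for a, b in zip([0] + num, num + [0])]
            denom = denom * (xs[i] - xs[j]) % q
        scale = ys[i] * pow(denom, -1, q) % q
        result = [(r + scale * v) % q for r, v in zip(result, num)]
    return result


def degree(coeffs):
    """Degree of a coefficient list; -1 for the zero polynomial."""
    for k in range(len(coeffs) - 1, -1, -1):
        if coeffs[k] != 0:
            return k
    return -1


def has_error(xs, received, t, q):
    """True if the interpolated polynomial has degree greater than t."""
    return degree(interpolate(xs, received, q)) > t


q, t = 101, 2
xs = [1, 2, 3, 4, 5, 6, 7]
received = [44, 2, 25, 23, 86, 83, 14]
recovered = interpolate(xs, received, q)
print(recovered)                       # [60, 58, 94, 84, 66, 98, 89]
print(has_error(xs, received, t, q))   # True
```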


19.3.3.  Error  Correction:  The  intuition  behind  error  correction  is  the  following.  Consider  a 
polynomial  of  degree  three  over  the  reals  evaluated  at  seven  points,  such  as  that  in  Figure  19.1. 
Clearly  there  is  only  one  cubic  curve  which  interpolates  all  of  the  points,  since  we  have  specified 
seven  of  them  and  we  only  need  four  such  points  to  define  a  cubic  curve.  Now  suppose  one  of  these 
evaluations  is  given  in  error,  for  example  the  point  at  x  =  3,  as  in  Figure  19.2.  We  see  that  we 
still  have  six  points  on  the  cubic  curve,  and  so  there  is  a  unique  cubic  curve  passing  through  these 
six  valid  points.  However,  suppose  we  took  a  different  set  of  six  points,  i.e.  five  valid  ones  and 
one  incorrect  one.  It  is  then  highly  likely  that  the  curve  which  goes  through  the  second  set  of  six 
points  would  not  be  cubic.  In  other  words  because  we  have  far  more  valid  points  than  we  need  to 
determine the cubic curve, we are able to recover it.

Figure 19.1. Cubic function evaluated at seven points

Figure 19.2. Cubic function going through six points and one error point

We return to the general case of a polynomial of degree t evaluated at n points. Suppose we
know that there are at most e errors in our code-word. We then have the following (very inefficient)
method for polynomial reconstruction.

•  We  produce  the  list  of  all  subsets  X  of  the  n  points  with  n  —  e  members. 

•  We  then  try  to  recover  a  polynomial  of  degree  t.  If  we  are  successful  then  there  is  a  good 
probability  that  the  subset  X  is  the  valid  set;  if  we  are  unsuccessful  then  we  know  that  X 
contains  an  element  which  is  in  error. 

This  is  clearly  a  silly  algorithm  to  error  correct  a  Reed-Solomon  code.  The  total  number  of  subsets 
we  may  have  to  take  is  given  by 

${}^{n}C_{n-e} = \frac{n!}{e! \cdot (n - e)!}.$

But despite this we will still analyse this algorithm a bit more: To be able to recover a polynomial
of degree $t$ we must have that $t < n - e$, i.e. we must have more valid elements than there are
coefficients to determine. Suppose we not only have $e$ errors but we also have $s$ “erasures”, i.e.
positions for which we do not even receive the value. Note that erasures are slightly better for the
receiver than errors, since with an erasure the receiver knows the position of the erasure, whereas
with an error they do not. To recover a polynomial of degree $t$ we will require

$t < n - e - s.$

But we could recover many such polynomials, for example if $t = n - e - s - 1$ then all such sets X
of $n - e - s$ points will result in a polynomial of degree at most $t$. To obtain a unique polynomial from
the above method we will need to make sure we do not have too many errors/erasures.

It can be shown that if we can obtain at least $t + 2 \cdot e$ points then we can recover a unique
polynomial of degree $t$ which passes through $n - s - e$ points of the set of $n - s$ points. This gives the
important condition that error correction for Reed-Solomon codes can be performed uniquely
provided

$n > t + 2 \cdot e + s.$

The  only  problem  left  is  how  to  perform  this  error  correction  efficiently. 
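
Before turning to an efficient method, the (very inefficient) subset search described in this section can be sketched in Python. It assumes no erasures ($s = 0$) and a known bound $e$ on the number of errors; the function names are hypothetical.

```python
from itertools import combinations


def interpolate(xs, ys, q):
    """Lagrange interpolation over F_q; returns coefficients, lowest degree first."""
    n = len(xs)
    result = [0] * n
    for i in range(n):
        num, denom = [1], 1
        for j in range(n):
            if j == i:
                continue
            num = [(a - xs[j] * b) % q for a, b in zip([0] + num, num + [0])]
            denom = denom * (xs[i] - xs[j]) % q
        scale = ys[i] * pow(denom, -1, q) % q
        result = [(r + scale * v) % q for r, v in zip(result, num)]
    return result


def degree(coeffs):
    """Degree of a coefficient list; -1 for the zero polynomial."""
    return max((k for k, c in enumerate(coeffs) if c != 0), default=-1)


def brute_force_decode(xs, received, t, e, q):
    """Try every subset of n - e positions; return the first interpolant of degree <= t."""
    n = len(xs)
    for subset in combinations(range(n), n - e):
        sub_x = [xs[i] for i in subset]
        sub_y = [received[i] for i in subset]
        coeffs = interpolate(sub_x, sub_y, q)
        if degree(coeffs) <= t:
            return coeffs[: t + 1]
    return None  # more than e errors, or the bound n > t + 2e was violated


q, t, e = 101, 2, 1
xs = [1, 2, 3, 4, 5, 6, 7]
received = [44, 2, 25, 23, 86, 83, 14]
decoded = brute_force_decode(xs, received, t, e, q)
print(decoded)  # [20, 57, 68]
```

Here $n = 7 > t + 2 \cdot e = 4$, so the recovered polynomial is unique; only the subset that omits the third position interpolates to a polynomial of degree at most two.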


19.3.4. The Berlekamp-Welch Algorithm: We now present an efficient method to perform
error correction for Reed-Solomon codes called the Berlekamp-Welch algorithm. The idea is to
interpolate a polynomial in two variables through the points which we are given. Suppose we are
given a code-word with $s$ missing values, and the number of errors is bounded by

$e \le t < \frac{n - s}{3}.$

This means we are actually given $n - s$ supposed values of $y_i = f(x_i)$. We know the pairs $(x_i, y_i)$
and we know that at most $e$ of them are wrong, in that they do not come from evaluating the
hidden polynomial $f(X)$. The goal of error correction is to try to recover this hidden polynomial.
We consider the bivariate polynomial

$Q(X, Y) = f_0(X) - f_1(X) \cdot Y,$

where $f_0$ (resp. $f_1$) is a polynomial of degree at most $2 \cdot t$ (resp. $t$). We impose the condition that
$f_1(0) = 1$. We treat the coefficients of the $f_i$ as variables which we want to determine. Due to the
bounds on the degrees of the two polynomials, and the extra condition of $f_1(0) = 1$, we see that
the number of variables we have is

$v = (2 \cdot t + 1) + (t + 1) - 1 = 3 \cdot t + 1.$

We would like the bivariate polynomial $Q(X, Y)$ to interpolate our points $(x_i, y_i)$. By substituting
in the values of $x_i$ and $y_i$ we obtain a linear equation in terms of the unknown coefficients of the
polynomials $f_i$. Since we have $n - s$ such points, the number of linear equations we obtain is $n - s$.
After determining $f_0$ and $f_1$ we then compute

$f \leftarrow \frac{f_0}{f_1}.$

To see that this results in the correct answer, consider the single polynomial in one variable

$P(X) = Q(X, f(X)),$

where $f(X)$ is the polynomial we are trying to determine. We have $\deg P(X) \le 2 \cdot t$. The polynomial
$P(X)$ clearly has at least $n - s - e$ zeros, i.e. the number of valid pairs. So the number of zeros is
at least

$n - s - e \ge n - s - t > 3 \cdot t - t = 2 \cdot t,$

since $e \le t < \frac{n - s}{3}$. Thus $P(X)$ has more zeros than its degree, and it must hence be the zero
polynomial. Hence,

$f_0 - f_1 \cdot f = 0,$

and so $f = f_0 / f_1$ since $f_1 \neq 0$.

Example: Again consider our previous example. We have received the invalid code-word

$c = (44, 2, 25, 23, 86, 83, 14).$

We know that the underlying code is for polynomials of degree $t = 2$. Hence, since $2 = t < 7/3 \approx 2.33$
we should be able to correct a single error. Using the method above we want to determine
the polynomial $Q(X, Y)$ of the form

$Q(X, Y) = f_{0,0} + f_{1,0} \cdot X + f_{2,0} \cdot X^2 + f_{3,0} \cdot X^3 + f_{4,0} \cdot X^4 - (1 + f_{1,1} \cdot X + f_{2,1} \cdot X^2) \cdot Y$

which passes through the seven given points. Hence we have seven variables to determine and we are
given seven equations. These equations form the linear system, modulo $q = 101$,


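$\begin{pmatrix}
1 & 1 & 1^2 & 1^3 & 1^4 & -44 \cdot 1 & -44 \cdot 1^2 \\
1 & 2 & 2^2 & 2^3 & 2^4 & -2 \cdot 2 & -2 \cdot 2^2 \\
1 & 3 & 3^2 & 3^3 & 3^4 & -25 \cdot 3 & -25 \cdot 3^2 \\
1 & 4 & 4^2 & 4^3 & 4^4 & -23 \cdot 4 & -23 \cdot 4^2 \\
1 & 5 & 5^2 & 5^3 & 5^4 & -86 \cdot 5 & -86 \cdot 5^2 \\
1 & 6 & 6^2 & 6^3 & 6^4 & -83 \cdot 6 & -83 \cdot 6^2 \\
1 & 7 & 7^2 & 7^3 & 7^4 & -14 \cdot 7 & -14 \cdot 7^2
\end{pmatrix}
\cdot
\begin{pmatrix} f_{0,0} \\ f_{1,0} \\ f_{2,0} \\ f_{3,0} \\ f_{4,0} \\ f_{1,1} \\ f_{2,1} \end{pmatrix}
=
\begin{pmatrix} 44 \\ 2 \\ 25 \\ 23 \\ 86 \\ 83 \\ 14 \end{pmatrix}.$

So we are solving the system

$\begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 57 & 57 \\
1 & 2 & 4 & 8 & 16 & 97 & 93 \\
1 & 3 & 9 & 27 & 81 & 26 & 78 \\
1 & 4 & 16 & 64 & 54 & 9 & 36 \\
1 & 5 & 25 & 24 & 19 & 75 & 72 \\
1 & 6 & 36 & 14 & 84 & 7 & 42 \\
1 & 7 & 49 & 40 & 78 & 3 & 21
\end{pmatrix}
\cdot
\begin{pmatrix} f_{0,0} \\ f_{1,0} \\ f_{2,0} \\ f_{3,0} \\ f_{4,0} \\ f_{1,1} \\ f_{2,1} \end{pmatrix}
=
\begin{pmatrix} 44 \\ 2 \\ 25 \\ 23 \\ 86 \\ 83 \\ 14 \end{pmatrix}
\pmod{101}.$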


We obtain the solution

$(f_{0,0}, f_{1,0}, f_{2,0}, f_{3,0}, f_{4,0}, f_{1,1}, f_{2,1}) \leftarrow (20, 84, 49, 11, 0, 67, 0),$

and hence the two polynomials

$f_0(X) \leftarrow 20 + 84 \cdot X + 49 \cdot X^2 + 11 \cdot X^3$ and $f_1(X) \leftarrow 1 + 67 \cdot X$.

We find that

$f(X) \leftarrow \frac{f_0(X)}{f_1(X)} = 20 + 57 \cdot X + 68 \cdot X^2,$

which is precisely the polynomial we started with at the beginning of this section. Hence, we have
corrected for the error in the transmitted code-word.
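
The whole procedure can be sketched in Python under the assumptions of this section (prime $q$, no erasures, at most $e \le t < n/3$ errors). The function names are hypothetical. The linear system need not have full rank, so this sketch sets any free variable to zero, which is an implementation choice rather than something the text prescribes.

```python
def solve_mod(M, q):
    """Gauss-Jordan on an augmented matrix over F_q; free variables are set to 0."""
    rows, cols = len(M), len(M[0]) - 1
    M = [[v % q for v in row] for row in M]
    pivots, r = [], 0
    for col in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if M[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot: this column corresponds to a free variable
        M[r], M[pivot] = M[pivot], M[r]
        inv = pow(M[r][col], -1, q)
        M[r] = [v * inv % q for v in M[r]]
        for i in range(rows):
            if i != r and M[i][col] != 0:
                factor = M[i][col]
                M[i] = [(a - factor * b) % q for a, b in zip(M[i], M[r])]
        pivots.append((r, col))
        r += 1
    if any(M[i][cols] != 0 for i in range(r, rows)):
        return None  # inconsistent system
    solution = [0] * cols
    for pr, pc in pivots:
        solution[pc] = M[pr][cols]
    return solution


def poly_divide(num, den, q):
    """Return (quotient, remainder) of num / den over F_q (lowest degree first)."""
    num = num[:]
    while den and den[-1] == 0:
        den = den[:-1]  # strip zero leading coefficients
    quot = [0] * max(len(num) - len(den) + 1, 1)
    inv = pow(den[-1], -1, q)
    for k in range(len(num) - len(den), -1, -1):
        coeff = num[k + len(den) - 1] * inv % q
        quot[k] = coeff
        for j, d in enumerate(den):
            num[k + j] = (num[k + j] - coeff * d) % q
    return quot, num


def berlekamp_welch(xs, ys, t, q):
    """Recover f of degree <= t from the pairs (x_i, y_i)."""
    rows = []
    for x, y in zip(xs, ys):
        # Q(x, y) = 0 with f_1(0) = 1 gives: f_0(x) - y * sum_{j>=1} f_{j,1} x^j = y.
        row = [pow(x, j, q) for j in range(2 * t + 1)]
        row += [(-y * pow(x, j, q)) % q for j in range(1, t + 1)]
        rows.append(row + [y % q])
    sol = solve_mod(rows, q)
    if sol is None:
        return None
    f0, f1 = sol[: 2 * t + 1], [1] + sol[2 * t + 1:]
    quotient, remainder = poly_divide(f0, f1, q)
    return None if any(remainder) else quotient[: t + 1]


xs = [1, 2, 3, 4, 5, 6, 7]
received = [44, 2, 25, 23, 86, 83, 14]
decoded = berlekamp_welch(xs, received, 2, 101)
print(decoded)  # [20, 57, 68]
```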


19.4.  Shamir  Secret  Sharing 

We  now  return  to  secret  sharing  schemes,  and  in  particular  the  Shamir  secret  sharing  scheme.  We 
suppose  we  have  n  parties  who  wish  to  share  a  secret  so  that  no  t  (or  fewer)  parties  can  recover 
the  secret.  Hence,  this  is  going  to  be  a  (t  +  l)-out-of-n  threshold  secret  sharing  scheme. 

First we suppose there is a trusted dealer who wishes to share the secret $s$ in $\mathbb{F}_q$. He first
generates a secret polynomial $f(X)$ of degree $t$ with $f(0) = s$. That is, he generates random
elements $f_i$ in $\mathbb{F}_q$ for $i = 1, \ldots, t$ and sets

$f(X) = s + f_1 \cdot X + \cdots + f_t \cdot X^t.$

The trusted dealer then identifies each of the n players by an element in a set $X \subset \mathbb{F}_q \setminus \{0\}$, for
example we could take $X = \{1, 2, \ldots, n\}$ if the characteristic of $\mathbb{F}_q$ was larger than n. Then if $i \in X$,
party $i$ is given the share $s_i \leftarrow f(i)$. Notice that the vector

$(s, s_1, \ldots, s_n)$


is a code-word for a Reed-Solomon code. Also note that if t + 1 parties come together then they
can recover the original polynomial via Lagrange interpolation and hence the secret s. Actually,
secret reconstruction can be performed more efficiently by making use of the equation

$s \leftarrow f(0) = \sum_{i=1}^{n} s_i \cdot \delta_i(0).$


Hence, for a set $Y \subset X$, we define the vector $r_Y$ by $r_Y = (r_{x_i,Y})_{x_i \in Y}$ to be the public “recombination”
vector, where

$r_{x_i,Y} = \prod_{x_j \in Y,\ x_j \neq x_i} \frac{-x_j}{x_i - x_j}.$

Then, if we obtain a set of shares from a subset $Y \subset X$ of the players, with $|Y| > t$, we can recover
$s$ via the simple summation

$s \leftarrow \sum_{x_i \in Y} r_{x_i,Y} \cdot s_i.$


Also note that if we receive some possible values from a set of parties Y, then we can recover
the original secret via the Berlekamp-Welch algorithm for decoding Reed-Solomon codes in the
presence of errors, assuming the number of invalid shares is bounded by $e$ where

$e \le t < \frac{|Y|}{3}.$
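
Share generation and reconstruction via the recombination vector can be sketched as follows. The function names are illustrative, and the use of Python's `secrets` module for randomness is an assumption of this sketch.

```python
import secrets


def share(secret, n, t, q):
    """Split `secret` into n shares so that any t + 1 of them recover it."""
    coeffs = [secret % q] + [secrets.randbelow(q) for _ in range(t)]

    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation of f at x
            acc = (acc * x + c) % q
        return acc

    return {i: f(i) for i in range(1, n + 1)}


def recombination_vector(Y, q):
    """r_{x_i,Y} = prod_{x_j in Y, x_j != x_i} (-x_j) / (x_i - x_j) mod q."""
    r = {}
    for xi in Y:
        num, den = 1, 1
        for xj in Y:
            if xj != xi:
                num = num * (-xj) % q
                den = den * (xi - xj) % q
        r[xi] = num * pow(den, -1, q) % q
    return r


def reconstruct(shares, q):
    """Recover s = f(0) from a dict {x_i: s_i} holding at least t + 1 shares."""
    r = recombination_vector(list(shares), q)
    return sum(r[x] * s for x, s in shares.items()) % q


q, n, t = 101, 7, 2
shares = share(42, n, t, q)
subset = {i: shares[i] for i in (2, 5, 7)}  # any t + 1 = 3 parties
print(reconstruct(subset, q))  # 42
```

Applied to the first three entries of the Reed-Solomon code-word from Section 19.3, `reconstruct({1: 44, 2: 2, 3: 96}, 101)` returns 20, the constant term of the example polynomial.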


Shamir secret sharing is an example of a secret sharing scheme which can be made into a pseudo-random
secret sharing scheme, or PRSS. This is a secret sharing scheme which allows the parties
to generate a sharing of a random value with almost no interaction. In particular the interaction
can be restricted to a set-up phase only.

To define the Shamir pseudo-random secret sharing scheme we take our n parties, which we
shall (for sake of concreteness) label by the set $X = \{1, 2, \ldots, n\}$, and threshold value $t + 1$. Then
for every subset $A \subset X$ of size $n - t$ we define the polynomial $f_A(X)$ of degree $t$ by the conditions

$f_A(0) = 1$ and $f_A(i) = 0$ for all $i \in X \setminus A$.

In the initialization phase for our PRSS we create a secret value $r_A \in S$, where $S$ is some key space.
For each subset $A$, the value of $r_A$ is securely distributed to every player in $A$. The n parties also
agree on a public pseudo-random function which is keyed by the secret values

$\psi : S \times S \to \mathbb{F}_q, \quad (r_A, a) \mapsto \psi(r_A, a).$

Now suppose the parties wish to generate a new secret sharing of a random value. By some means,
either by interaction or by prearrangement (e.g. a counter value), they select a public random value
$a \in S$. They then generate a random Shamir sharing where the underlying polynomial of degree $t$
is given by

$f(X) = \sum_{A \subset X,\ |A| = n - t} \psi(r_A, a) \cdot f_A(X),$

where the sum is over all subsets $A$ of size $n - t$. This means that each party $i$ receives the share

$s_i \leftarrow \sum_{i \in A \subset X,\ |A| = n - t} \psi(r_A, a) \cdot f_A(i),$

where the sum is over all subsets $A$ of size $n - t$ which contain the element $i$. Finally the random
value which is shared, via the Shamir secret sharing scheme, is given by

$s = \sum_{A \subset X,\ |A| = n - t} \psi(r_A, a) \cdot f_A(0) = \sum_{A \subset X,\ |A| = n - t} \psi(r_A, a).$
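
A toy Python sketch of the PRSS follows. The text does not fix a concrete pseudo-random function, so HMAC-SHA256 reduced modulo $q$ is used here purely as an assumed stand-in for $\psi$ (the reduction introduces a small bias that a real design would have to address). For brevity all keys live in one dictionary; in a deployment party $i$ would hold only the keys $r_A$ with $i \in A$.

```python
import hashlib
import hmac
import secrets
from itertools import combinations


def psi(key, a, q):
    """Stand-in PRF: HMAC-SHA256 of the public value a under key, reduced mod q."""
    digest = hmac.new(key, a, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % q


def f_A(A, parties, x, q):
    """Evaluate f_A at x, where f_A(0) = 1 and f_A(i) = 0 for every i not in A."""
    value = 1
    for i in parties:
        if i not in A:
            value = value * (i - x) * pow(i, -1, q) % q
    return value


def prss_share(i, keys, parties, a, q):
    """Share of party i, computed only from keys r_A of subsets A containing i."""
    return sum(psi(r, a, q) * f_A(A, parties, i, q)
               for A, r in keys.items() if i in A) % q


n, t, q = 4, 1, 101
parties = range(1, n + 1)
# Set-up phase: one secret key r_A per subset A of size n - t.
keys = {A: secrets.token_bytes(16) for A in combinations(parties, n - t)}

a = b"counter-0"  # public value agreed by prearrangement
shares = {i: prss_share(i, keys, parties, a, q) for i in parties}
shared_value = sum(psi(r, a, q) for r in keys.values()) % q

# With t = 1 the sharing polynomial is a line, so f(0) = 2*f(1) - f(2).
print((2 * shares[1] - shares[2]) % q == shared_value)  # True
```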

Other secret sharing schemes can also be turned into PRSSs. Consider the n-out-of-n scheme
over $\mathbb{F}_q$ for which the secret is given by $s = s_1 + \cdots + s_n$, where $s_i$ is chosen uniformly at random
from the field $\mathbb{F}_q$. This is immediately a PRSS, since to generate a sharing of a random value
unknown to any party, each party only needs to generate a random value.

In Chapter 22 we shall require not only a pseudo-random secret sharing, but also a variant called
pseudo-random zero sharing, or PRZS, for the Shamir secret sharing scheme. In pseudo-random
zero sharing we wish to generate random sharings of the value zero, with respect to a polynomial of
degree $2 \cdot t$. The exact reason why the polynomial has to be of degree $2 \cdot t$ will become apparent when
we discuss our application in Chapter 22. To enable this extra functionality we require exactly the
same set-up as for the PRSS, but now we use a different pseudo-random function,

$\psi : S \times S \times \{1, \ldots, t\} \to \mathbb{F}_q, \quad (r_A, x, j) \mapsto \psi(r_A, x, j).$

Then to create a degree $2 \cdot t$ Shamir secret sharing of zero, the parties pick a number $a$ as before.
The underlying polynomial is then given by

$f(X) = \sum_{A \subset X,\ |A| = n - t} \left( \sum_{j=1}^{t} \psi(r_A, a, j) \cdot X^j \right) \cdot f_A(X).$

Clearly this is a polynomial which shares the zero value, as the polynomial is divisible by $X$. The
share for party $i$ is given by

$s_i \leftarrow \sum_{A \subset X,\ |A| = n - t} \left( \sum_{j=1}^{t} \psi(r_A, a, j) \cdot i^j \right) \cdot f_A(i).$
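
The zero sharing can be sketched in the same style. As before, HMAC-SHA256 reduced modulo $q$ is only an assumed stand-in for $\psi$, and all names are illustrative. Terms with $i \notin A$ vanish because $f_A(i) = 0$, so each party only needs the keys of the subsets that contain it.

```python
import hashlib
import hmac
import secrets
from itertools import combinations


def psi(key, a, j, q):
    """Stand-in PRF on (a, j): HMAC-SHA256 under key, reduced mod q."""
    msg = a + j.to_bytes(4, "big")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest(), "big") % q


def f_A(A, parties, x, q):
    """f_A has degree t, f_A(0) = 1 and f_A(i) = 0 for every i not in A."""
    value = 1
    for i in parties:
        if i not in A:
            value = value * (i - x) * pow(i, -1, q) % q
    return value


def przs_share(i, keys, parties, a, t, q):
    """Share of party i in a degree 2t sharing of zero."""
    total = 0
    for A, r in keys.items():
        if i in A:
            inner = sum(psi(r, a, j, q) * pow(i, j, q) for j in range(1, t + 1))
            total += inner * f_A(A, parties, i, q)
    return total % q


def value_at_zero(shares, q):
    """Lagrange interpolation at X = 0 through all given shares."""
    total = 0
    for xi, si in shares.items():
        num, den = 1, 1
        for xj in shares:
            if xj != xi:
                num = num * (-xj) % q
                den = den * (xi - xj) % q
        total += si * num * pow(den, -1, q)
    return total % q


n, t, q = 5, 2, 101
parties = range(1, n + 1)
keys = {A: secrets.token_bytes(16) for A in combinations(parties, n - t)}
shares = {i: przs_share(i, keys, parties, b"counter-0", t, q) for i in parties}
print(value_at_zero(shares, q))  # 0
```

Here $n = 2 \cdot t + 1$, so the five shares determine the degree $2 \cdot t$ polynomial exactly and its value at zero is the shared value.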


19.5.  Application:  Shared  RSA  Signature  Generation 

We  shall  now  present  a  simple  application  of  a  secret  sharing  scheme,  which  has  applications  in  the 
real  world.  Having  introduced  digital  certificates  in  Chapter  18,  we  present  an  application  of  secret 
sharing  to  distributed  RSA  signatures.  Suppose  a  company  is  setting  up  a  certificate  authority  to 
issue  RSA  signed  certificates  to  its  employees  to  enable  them  to  access  various  corporate  services. 
It considers the associated RSA private key to be highly sensitive; after all, if the private key
was compromised then the company’s entire corporate infrastructure could also be compromised.
Suppose the public key is $(N, e)$ and the private key is $d$.

The company decides that to mitigate the risk it will divide the private key into three shares
and place the three shares on three different continents. Thus, for example, there will be one server
in Asia, one in America and one in Europe. As soon as the RSA key is generated, the company
generates three integers $d_1$, $d_2$ and $d_3$ such that

$d = d_1 + d_2 + d_3 \pmod{\phi(N)}.$

The company then removes all knowledge of $d$ and places $d_1$ on a secure computer in Asia, $d_2$ on
a secure computer in America and $d_3$ on a secure computer in Europe.

Now an employee wishes to obtain a digital certificate. This is essentially the RSA signature
on a (probably hashed) string $m$. The employee simply sends the string $m$ to the three computers,
which respond with

$s_i \leftarrow m^{d_i}$ for $i = 1, 2, 3$.

The valid RSA signature is then obtained by multiplying the three shares together, i.e.

$s \leftarrow s_1 \cdot s_2 \cdot s_3 = m^{d_1 + d_2 + d_3} = m^d.$


Here,  we  have  used  the  fact  that  the  RSA  function  is  multiplicatively  homomorphic. 
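
A toy numerical check of this 3-out-of-3 scheme is sketched below. The primes, the exponent and the message are arbitrary small values chosen for illustration only; real RSA moduli are thousands of bits long and the signed value would be a padded hash.

```python
import secrets

# Toy parameters (assumed for illustration; far too small for real use).
p, q = 1019, 1031
N, phi = p * q, (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)  # private exponent (Python 3.8+)

# Split d additively modulo phi(N); afterwards d itself can be discarded.
d1 = secrets.randbelow(phi)
d2 = secrets.randbelow(phi)
d3 = (d - d1 - d2) % phi

m = 424242  # stands in for the (hashed) string to be signed

# Each server returns its fragment s_i = m^{d_i} mod N.
fragments = [pow(m, di, N) for di in (d1, d2, d3)]

# The user multiplies the fragments to obtain s = m^d mod N.
s = 1
for frag in fragments:
    s = s * frag % N

print(pow(s, e, N) == m)  # True: s verifies as an RSA signature on m
```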

This  scheme  appears  to  solve  the  problem  of  not  putting  the  master  signature  key  in  only  one 
location.  However,  the  employee  now  needs  the  three  servers  to  be  online  in  order  to  obtain  his 
certificate.  It  would  be  much  nicer  if  only  two  had  to  be  online,  since  then  the  company  could  cope 
with  outages  of  servers.  The  problem  is  that  the  above  scheme  essentially  implements  a  3-out-of-3 
secret  sharing  scheme,  whereas  what  we  want  is  a  2-out-of-3.  Clearly,  we  need  to  apply  something 
along the lines of Shamir secret sharing. However, the problem is that the number $\phi(N)$ needs to
be kept secret, and the denominators in the Lagrange interpolation formulae may not be coprime
to $\phi(N)$.

There have been many solutions proposed to the above problem of threshold RSA; however, the
most elegant and simple is due to Shoup. Suppose we want a t-out-of-n sharing of the RSA secret
key $d$, where we assume that $e$ is chosen so that it is a prime and $e > n$. We adapt the Shamir
scheme as follows: Firstly a polynomial of degree $t - 1$ is chosen, by selecting $f_i$ modulo $\phi(N)$ at
random, to obtain

$f(X) = d + f_1 \cdot X + \cdots + f_{t-1} \cdot X^{t-1}.$

Then each server is given the share $d_i = f(i)$. The number of parties n is assumed to be fixed and
we define $\Delta$ to be the constant $\Delta = n!$.

Now suppose a user wishes to obtain a signature on the message $m$, i.e. it wants to compute $m^d \pmod N$.
It sends $m$ to each server, which then computes the signature fragment as

$s_i = m^{2 \cdot \Delta \cdot d_i} \pmod N.$


These signature fragments are then sent back to the user. Suppose now that the user obtains
fragments back from a subset $Y = \{i_1, \ldots, i_t\} \subset \{1, \ldots, n\}$, of size greater than or equal to $t$. This
set defines a “recombination” vector $r_Y = (r_{i_j,Y})_{i_j \in Y}$ defined by

$r_{i_j,Y} \leftarrow \prod_{i_k \in Y,\ i_j \neq i_k} \frac{-i_k}{i_j - i_k}.$

We really want to be able to compute this modulo $\phi(N)$, but that is impossible since $\phi(N)$ is not
known to any of the participants. In addition the denominator may not be invertible modulo $\phi(N)$.
However, we note that the denominator in the above divides $\Delta$ and so we have that $\Delta \cdot r_{i_j,Y} \in \mathbb{Z}$.
Hence, the user can compute

$\sigma \leftarrow \prod_{i_j \in Y} s_{i_j}^{2 \cdot \Delta \cdot r_{i_j,Y}} \pmod N.$


We find that this is equal to

$\sigma = \left( m^{4 \cdot \Delta^2} \right)^{\sum_{i_j \in Y} r_{i_j,Y} \cdot d_{i_j}} = m^{4 \cdot \Delta^2 \cdot d} \pmod N,$


with the last equality working due to Lagrange interpolation modulo $\phi(N)$. From this partial
signature we need to recover the real signature. To do this we use the fact that we have assumed
that $e > n$ and that $e$ is a prime. These latter two facts mean that $e$ is coprime to $4 \cdot \Delta^2$, and so
via the extended Euclidean algorithm we can compute integers $u$ and $v$ such that

$u \cdot e + v \cdot 4 \cdot \Delta^2 = 1,$

from which the signature is computed as

$s \leftarrow m^u \cdot \sigma^v \pmod N.$

That $s$ is the valid RSA signature for this public/private key pair can be verified since

$s^e = (m^u \cdot \sigma^v)^e = m^{e \cdot u} \cdot m^{4 \cdot e \cdot v \cdot \Delta^2 \cdot d} = m^{u \cdot e + 4 \cdot v \cdot \Delta^2} = m.$

One problem with the protocol as we have described it is that the signature shares $s_i$ may be
invalid. See the paper by Shoup in the Further Reading section to see how this problem can be
removed using zero-knowledge proofs.
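
The threshold signing procedure described above (without the zero-knowledge proofs of share validity) can be sketched in Python with toy parameters. All names, the small primes and the message value are assumptions made for illustration; negative exponents rely on the three-argument `pow` of Python 3.8 or later and on $m$ being coprime to $N$.

```python
import secrets
from fractions import Fraction
from math import factorial


def egcd(a, b):
    """Return (g, u, v) with u*a + v*b = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = egcd(b, a % b)
    return g, v, u - (a // b) * v


def deal(d, n, t, phi):
    """Shares d_i = f(i) mod phi(N), with f random of degree t - 1 and f(0) = d."""
    coeffs = [d] + [secrets.randbelow(phi) for _ in range(t - 1)]
    return {i: sum(c * i**k for k, c in enumerate(coeffs)) % phi
            for i in range(1, n + 1)}


def fragment(m, d_i, delta, N):
    """Signature fragment s_i = m^(2 * Delta * d_i) mod N."""
    return pow(m, 2 * delta * d_i, N)


def combine(m, fragments, delta, e, N):
    """Combine fragments {i: s_i} from at least t servers into s = m^d mod N."""
    sigma = 1
    for i in fragments:
        r = Fraction(1)
        for k in fragments:
            if k != i:
                r *= Fraction(-k, i - k)  # recombination coefficient r_{i,Y}
        exponent = 2 * delta * r          # an integer, since Delta * r_{i,Y} is
        assert exponent.denominator == 1
        sigma = sigma * pow(fragments[i], int(exponent), N) % N
    g, u, v = egcd(e, 4 * delta**2)       # u*e + v*4*Delta^2 = 1
    assert g == 1
    return pow(m, u, N) * pow(sigma, v, N) % N


# Toy parameters (assumed; far too small for real use).
p, q = 1019, 1031
N, phi = p * q, (p - 1) * (q - 1)
e = 17                                    # prime and larger than n
d = pow(e, -1, phi)
n, t = 3, 2
delta = factorial(n)

shares = deal(d, n, t, phi)
m = 424242
fragments = {i: fragment(m, shares[i], delta, N) for i in (1, 3)}  # server 2 offline
s = combine(m, fragments, delta, e, N)
print(pow(s, e, N) == m)  # True
```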


Chapter  Summary 


•  We  have  defined  the  general  concept  of  secret  sharing  schemes  and  shown  how  these  can 
be  constructed,  albeit  inefficiently,  for  any  access  structure. 

•  We  have  introduced  Reed-Solomon  error-correcting  codes  and  presented  the  Berlekamp-Welch
decoding  algorithm.

•  We  presented  Shamir’s  secret  sharing  scheme,  which  produces  a  highly  efficient,  and  secure, 
secret  sharing  scheme  in  the  case  of  threshold  access  structures. 

•  We  extended  the  Shamir  scheme  to  give  both  pseudo-random  secret  sharing  and
pseudo-random  zero  sharing.

•  Finally  we  showed  how  one  can  adapt  the  Shamir  scheme  to  enable  the  creation  of  a 
threshold  RSA  signature  scheme. 


Further  Reading 

Shamir’s secret sharing scheme is presented in his short ACM paper from 1979. Shoup’s threshold
RSA scheme is presented in his Eurocrypt 2000 paper; this paper also explains the occurrence of
the $\Delta^2$ term in the above discussion, rather than a single $\Delta$ term. A good description of secret
sharing schemes for general access structures, including some relatively efficient constructions, is
presented in the relevant chapter in Stinson’s book.

A. Shamir. How to share a secret. Communications of the ACM, 22, 612-613, 1979.

V. Shoup. Practical threshold signatures. In Advances in Cryptology - Eurocrypt 2000, LNCS 1807,
207-220, Springer, 2000.

D.  Stinson.  Cryptography:  Theory  and  Practice.  Third  Edition.  CRC  Press,  2005. 


CHAPTER  20 


Commitments  and  Oblivious  Transfer 


Chapter  Goals 

•  To  present  two  protocols  which  are  carried  out  between  mutually  untrusting  parties. 

•  To  introduce  commitment  schemes  and  give  simple  examples  of  efficient  implementations. 

•  To  introduce  oblivious  transfer,  and  again  give  simple  examples  of  how  this  can  be
performed  in  practice.


20.1.  Introduction 

In  this  chapter  we  shall  examine  a  number  of  more  advanced  cryptographic  protocols  which  enable 
higher-level  services  to  be  created.  We  shall  particularly  focus  on  protocols  for 

•  commitment  schemes, 

•  oblivious  transfer. 

Whilst  there  is  a  large  body  of  literature  on  these  protocols,  we  shall  keep  our  feet  on  the  ground 
and  focus  on  protocols  which  can  be  used  in  real  life  to  achieve  practical  higher-level  services.  It
turns  out  that  these  two  primitives  are  in  some  sense  the  most  basic  atomic  cryptographic  primitives 
which  one  can  construct. 

Up  until  now  we  have  looked  at  cryptographic  schemes  and  protocols  in  which  the  protocol 
participants  are  honest,  and  we  are  trying  to  protect  their  interests  against  an  external  adversary. 
However,  in  the  real  world  we  often  need  to  interact  with  people  who  we  do  not  necessarily  trust. 
In  this  chapter  we  examine  two  types  of  protocol  which  are  executed  between  two  parties,  each 
of  whom  may  want  to  cheat  in  some  way.  The  simplistic  protocols  in  this  chapter  will  form  the 
building  blocks  on  which  more  complicated  protocols  will  be  built  in  Chapters  21  and  22.  We  start 
by  focusing  on  commitment  schemes,  and  then  we  pass  to  oblivious  transfer. 

20.2.  Commitment  Schemes 

Suppose  Alice  wishes  to  play  “paper-scissors-stone”  over  the  telephone  with  Bob.  The  idea  of  this 
game  is  that  Alice  and  Bob  simultaneously  choose  one  of  the  set  {paper ,  scissors ,  stone}.  Then 
the  outcome  of  the  game  is  determined  by  the  rules: 

•  Paper  wraps  stone.  Hence  if  Alice  chooses  paper  and  Bob  chooses  stone  then  Alice  wins. 

•  Stone  blunts  scissors.  Hence  if  Alice  chooses  stone  and  Bob  chooses  scissors  then 
Alice  wins. 

•  Scissors  cut  paper.  Hence  if  Alice  chooses  scissors  and  Bob  chooses  paper  then  Alice 
wins. 

If  both  Alice  and  Bob  choose  the  same  item  then  the  game  is  declared  a  draw.  When  conducted 
over  the  telephone  we  have  the  problem  that  whoever  goes  first  is  going  to  lose  the  game. 

One way around this is for the party who goes first to “commit” to their choice, in such a
way that the other party cannot determine what was committed to. Then the two parties can
reveal their choices, with the idea that the other party can then verify that the revealing party has
not altered its choice between the commitment and the revealing stage. Such a system is called a
commitment scheme. An easy way to do this is to use a cryptographic hash function as follows:

$A \to B : h_A = H(R_A \| \text{paper}),$

$B \to A : \text{scissors},$

$A \to B : R_A, \text{paper}.$

At the end of the protocol Bob needs to verify that the $h_A$ sent by Alice is equal to $H(R_A \| \text{paper})$.
If the values agree he knows that Alice has not cheated. The result of this protocol is that Alice
loses the game since scissors cut paper.
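
The three-message protocol above can be sketched in Python. SHA-256 is used as an assumed instantiation of $H$, and a fixed 32-byte random value $R_A$ is an assumption of this sketch; the text does not specify either.

```python
import hashlib
import secrets


def commit(choice):
    """Return (h, r) with h = H(r || choice), using SHA-256 and a 32-byte random r."""
    r = secrets.token_bytes(32)
    h = hashlib.sha256(r + choice.encode()).digest()
    return h, r


def verify(h, r, choice):
    """Check that (r, choice) opens the commitment h."""
    expected = hashlib.sha256(r + choice.encode()).digest()
    return secrets.compare_digest(h, expected)


# A -> B : h_A
h_A, R_A = commit("paper")
# B -> A : scissors
bob_choice = "scissors"
# A -> B : R_A, paper ; Bob now checks the opening.
print(verify(h_A, R_A, "paper"))  # True
print(verify(h_A, R_A, "stone"))  # False: Alice cannot claim a different choice
```

Using a fixed-length random prefix keeps the parsing of $R_A \| \text{choice}$ unambiguous, which is a design choice of this sketch.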

Let us look at the above from Alice’s perspective. She first commits to the value paper by
sending Bob the hash value $h_A$. This means that Bob will not be able to determine that Alice has
committed to the value paper, since Bob does not know the random value of $R_A$ used and Bob is
unable to invert the hash function. The fact that Bob cannot determine what value was committed
to is called the concealing or hiding property of a commitment scheme.

As soon as Bob sends the value scissors to Alice, she knows she has lost but is unable to cheat,
since to cheat she would need to come up with a different value of $R_A$, say $R'_A$, which satisfied

$H(R_A \| \text{paper}) = H(R'_A \| \text{stone}).$

But this would mean that Alice could find collisions in the hash function, which for a suitably
chosen hash function is believed to be impossible. Actually we require that the hash function is
second preimage resistant in this case. This property of the commitment scheme, that Alice cannot
change her mind after the commitment procedure, is called binding.

Let  us  now  study  these  properties  of  concealing  and  binding  in  more  detail.  Recall  that  an 
encryption  function  has  information-theoretic  security  if  an  adversary  with  infinite  computing  power 
could  not  break  the  scheme,  whilst  an  encryption  function  is  called  computationally  secure  if  it  is 
only  secure  when  faced  with  an  adversary  with  polynomially  bounded  computing  power.  A  similar 
division  can  be  made  with  commitment  schemes,  but  now  we  have  two  security  properties,  namely 
concealing  and  binding.  One  property  protects  the  interests  of  the  sender,  and  one  property  protects 
the interests of the receiver. To simplify our exposition we shall denote our abstract commitment
scheme by a public algorithm, $c = C(x, r)$, which takes a value $x \in \mathbb{P}$ and some randomness $r \in \mathbb{R}$
and produces a commitment $c \in \mathbb{C}$. To decommit, the committer simply reveals the values of $x$
and $r$. The receiver then checks that the two values produce the original commitment.


[Figure: the adversary A outputs $(x, r), (x', r') \in \mathbb{P} \times \mathbb{R}$ and wins if $C(x, r) = C(x', r')$ and $x \neq x'$.]

Figure 20.1. Commitment scheme: binding game

Definition 20.1 (Binding). A commitment scheme is said to be information-theoretically (resp.
computationally) binding if no infinitely powerful (resp. computationally bounded) adversary can
win the following game.

• The adversary outputs values $x \in \mathbb{P}$ and $r \in \mathbb{R}$.

• The adversary must then output a value $x' \neq x$ and a value $r' \in \mathbb{R}$ such that

$C(x, r) = C(x', r').$

This game is given graphically in Figure 20.1. If the commitment scheme $\Pi$ is computationally
binding then we can define an advantage statement which is defined as

$\mathrm{Adv}^{\mathrm{bind}}_{\Pi}(A) = \Pr[A \text{ wins the binding game}].$

For  information-theoretically  binding  schemes  such  an  advantage  will  be  zero  by  definition. 


Figure  20.2.  Commitment  scheme:  concealing  game 


Definition  20.2  (Concealing).  A  commitment  scheme  is  said  to  be  information-theoretically  (resp. 
computationally)  concealing  if  no  infinitely  powerful  (resp.  computationally  bounded)  adversary  can 
win  the  following  game. 

• The adversary outputs two messages $x_0$ and $x_1$ of equal length.

• The challenger generates $r^* \in \mathbb{R}$ at random and a random bit $b \in \{0, 1\}$.

• The challenger computes $c^* = C(x_b, r^*)$ and passes $c^*$ to the adversary.

• The adversary’s goal now is to guess the bit $b$.

This game is defined in Figure 20.2. Just as with the binding property, if the commitment scheme
$\Pi$ is computationally concealing then we can define an advantage statement which is defined as

$\mathrm{Adv}^{\mathrm{conceal}}_{\Pi}(A) = 2 \cdot \left| \Pr[A \text{ wins the concealing game}] - \frac{1}{2} \right|.$

For information-theoretically concealing schemes such an advantage will be zero by definition.
Notice that this definition of concealing is virtually identical to our definition of indistinguishability
of encryptions. A number of results trivially follow from these two definitions.

Lemma 20.3. There exists no scheme which is both information-theoretically concealing and binding.

Proof. Suppose we have a scheme which is both information-theoretically concealing and binding,
and suppose the committer makes a commitment $c \leftarrow C(x, r)$. Since it is information-theoretically
concealing there must exist values $x' \neq x$ and $r'$ such that $c = C(x', r')$; otherwise an infinitely powerful
receiver could break the concealing property. But if Alice is also infinitely powerful this means she
can find such a pair $(x', r')$ and so break the binding property as well. □


Lemma 20.4. The commitment scheme defined by

$$c \leftarrow H(r \| m),$$

for a random value $r$, a committed value $m$ and some cryptographic hash function $H$, is at best

•  computationally  binding, 

•  information-theoretically  concealing. 


Proof. All cryptographic hash functions we have met are only computationally secure with respect to
preimage resistance and second preimage resistance. The binding property of the above scheme
is only guaranteed by the second preimage resistance of the underlying hash function. Hence, the
binding property is only computationally secure.

The concealing property of the above scheme is only guaranteed by the preimage resistance of
the underlying hash function. Hence, the concealing property looks like it should be only computationally
secure. However, if we assume that the value $r$ is chosen from a suitably large set, then
the fact that the hash function should have many collisions works in our favour and in practice
we should obtain something “close” to information-theoretic concealing. On the other hand if we
assume that $H$ is a random oracle, then the commitment scheme is clearly information-theoretically
concealing. □
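To make Lemma 20.4 concrete, the following is a minimal sketch of the commitment $c \leftarrow H(r \| m)$. The choice of SHA-256 for $H$, the 32-byte length of $r$ and the function names `commit` and `verify` are assumptions made for illustration only; they are not fixed by the text.

```python
import hashlib
import secrets

R_LEN = 32  # assumed length of r in bytes; a fixed length keeps r || m unambiguous


def commit(message: bytes) -> tuple:
    """Commit to `message`: returns (c, r) with c = H(r || m) and r fresh randomness."""
    r = secrets.token_bytes(R_LEN)
    return hashlib.sha256(r + message).digest(), r


def verify(c: bytes, message: bytes, r: bytes) -> bool:
    """Decommitment check: recompute H(r || m) and compare it with c."""
    return len(r) == R_LEN and hashlib.sha256(r + message).digest() == c


if __name__ == "__main__":
    c, r = commit(b"heads")
    print(verify(c, b"heads", r))  # the honest opening is accepted
    print(verify(c, b"tails", r))  # a different message with the same r is rejected
```

The committer publishes `c` first and reveals `(message, r)` later; the receiver only ever runs `verify`.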


We now turn to three practical commitment schemes which occur in various real-world protocols.
All are based on a finite abelian group $G$ of prime order $q$, which is generated by $g$. Two of the
schemes will also require another generator $h \in \langle g \rangle$, where the discrete logarithm of $h$ to the base
$g$ is unknown by any user in the system. To generate $g$ and $h$ we need to ensure that no one knows
the discrete logarithm, and hence it needs to be done in a verifiably random manner.

Verifiably random generation of $g$ and $h$ is quite easy to ensure. For example, for a finite field $\mathbb{F}_p^*$
with $q$ dividing $p - 1$ we create $g$ as follows (with a similar procedure being used to determine $h$):

• $r \leftarrow \mathbb{Z}$.

• $f \leftarrow H(r) \in \mathbb{F}_p^*$ for some cryptographic hash function $H$.

• $g \leftarrow f^{(p-1)/q} \pmod{p}$.

• If $g = 1$ then return to the first stage, else output $(r, g)$.

This generates a random element of the subgroup of $\mathbb{F}_p^*$ of order $q$, with the property that it is
generated verifiably at random since one outputs the seed $r$ used to generate the random element.
Thus anyone who wishes to verify that $g$ was generated in the above manner can use $r$ to rerun
the algorithm. Since we have used a cryptographic hash function $H$ we do not believe that it is
feasible for anyone to construct an $r$ which produces a group element whose discrete logarithm is
known with respect to some other group element.
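The four steps above can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 stands in for $H$, the helper `hash_to_field` (a hypothetical name) reduces the digest into $\{1, \ldots, p-1\}$, the seed is drawn from 64-bit integers, and the parameters $p = 23$, $q = 11$ in the test are toy values far too small for real use.

```python
import hashlib
import secrets


def hash_to_field(r: int, p: int) -> int:
    """Map the seed r to an element f of F_p^* (assumed encoding: SHA-256, reduced into 1..p-1)."""
    digest = hashlib.sha256(r.to_bytes(16, "big")).digest()
    return int.from_bytes(digest, "big") % (p - 1) + 1


def generate(p: int, q: int) -> tuple:
    """Return (r, g) with g = H(r)^((p-1)/q) mod p and g != 1."""
    while True:
        r = secrets.randbelow(2**64)                   # r <- Z (a 64-bit seed here)
        g = pow(hash_to_field(r, p), (p - 1) // q, p)  # force g into the order-q subgroup
        if g != 1:
            return r, g


def check(p: int, q: int, r: int, g: int) -> bool:
    """Anyone can rerun the derivation from the published seed r."""
    return g != 1 and g == pow(hash_to_field(r, p), (p - 1) // q, p)
```

Publishing `r` alongside `g` is what lets a third party call `check` and convince themselves that `g` was not chosen with a known discrete logarithm.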

Given $g, h$ we define two commitment schemes, $B(x)$ and $B_a(x)$, to commit to an integer $x$
modulo $q$, and one, $E_a(x)$, to commit to an element $x \in \langle g \rangle$:

$$B(x) = g^x,$$

$$E_a(x) = (g^a, x \cdot h^a),$$

$$B_a(x) = h^x \cdot g^a,$$

where $a$ is a random integer modulo $q$. To reveal the commitments the user publishes the value $x$
in the first scheme and the pair $(a, x)$ in the second and third schemes. The value $a$ is called the
blinding value, since it blinds the committed value $x$; in the case of $B_a(x)$ it does so even against a
computationally unbounded adversary. The scheme given by $B_a(x)$ is called Pedersen’s commitment scheme.
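As an illustration, the three schemes can be written out in a toy group. The parameters are assumptions chosen for readability: the subgroup of order $q = 11$ in $\mathbb{F}_{23}^*$ with $g = 2$ and $h = 3$. In such a tiny group anyone can compute the discrete logarithm of $h$, so this sketch shows the arithmetic only and offers no security.

```python
# Toy parameters (assumed for illustration): 2 and 3 both have order 11 modulo 23.
P, Q, G, H = 23, 11, 2, 3


def commit_B(x: int) -> int:
    """B(x) = g^x: deterministic commitment to an integer x modulo q."""
    return pow(G, x % Q, P)


def commit_E(x_elem: int, a: int) -> tuple:
    """E_a(x) = (g^a, x * h^a): commitment to a group element x in <g>."""
    return pow(G, a % Q, P), x_elem * pow(H, a % Q, P) % P


def commit_pedersen(x: int, a: int) -> int:
    """B_a(x) = h^x * g^a: Pedersen commitment to an integer x modulo q."""
    return pow(H, x % Q, P) * pow(G, a % Q, P) % P


def open_pedersen(c: int, x: int, a: int) -> bool:
    """The receiver recomputes the commitment from the revealed pair (a, x)."""
    return c == commit_pedersen(x, a)
```

In a real deployment `a` would be drawn uniformly at random modulo $q$ for every commitment; it is passed in explicitly here so that the arithmetic can be checked by hand.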

Lemma 20.5. The commitment scheme $B(x)$ is information-theoretically binding.

Proof. Suppose Alice, having published $c = B(x) = g^x$, wishes to change her mind as to which
element of $\mathbb{Z}/q\mathbb{Z}$ she wants to commit to. Alas, for Alice no matter how much computing power she
has there is mathematically only one element in $\mathbb{Z}/q\mathbb{Z}$, namely $x$, which is the discrete logarithm of
the commitment $c$ to the base $g$. Hence, the scheme is clearly information-theoretically binding. □

Note that this commitment scheme does not meet our strong definition of security for the concealing
property, in any way; after all it is deterministic. If the space of values from which $x$ is selected
is large, then this commitment scheme could meet a weaker security definition related to a
one-way-like property. We leave the reader to produce a suitable definition for a one-way concealing
property, and show that $B(x)$ meets this definition.

Lemma 20.6. The commitment scheme $E_a(x)$ is information-theoretically binding and computationally
concealing.

Proof. This scheme is exactly ElGamal encryption with respect to a public key $h$, for which no
one knows the associated private key. Indeed any IND-CPA secure public key encryption scheme
can be used in this way as a commitment scheme.

The underlying IND-CPA security implies that the resulting commitment scheme is computationally
concealing, whilst the fact that the decryption is unique implies that the commitment
scheme is information-theoretically binding. □

Lemma 20.7. The Pedersen commitment scheme, given by $B_a(x)$, is computationally binding and
information-theoretically concealing.

Proof. Suppose the adversary, after having committed to $c \leftarrow B_a(x) = h^x \cdot g^a$, wishes to change
her mind, so as to commit to $y \neq x$ instead. So the adversary outputs another pair $(y, b)$ such that
$c = h^y \cdot g^b$. However, given these two openings we can extract the discrete logarithm of $h$ with respect
to $g$ via

$$\log_g h = \frac{a - b}{y - x} \pmod{q}.$$

Thus any algorithm which breaks the binding property can be turned into an algorithm which
solves discrete logarithms in the group $G$.

We now turn to the concealing property. It is clear that this is information-theoretically concealing
since an all-powerful adversary could extract the discrete logarithm of $h$ with respect to $g$
and then any committed value $c$ can be opened to any message $x$; hence $c$ reveals nothing about
which value was committed. □
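The reduction in the binding part of the proof is short enough to check numerically. The sketch below assumes the same toy group as a running illustration ($p = 23$, $q = 11$, $g = 2$, $h = 3$, where in fact $h = g^8$); the function name `extract_dlog` is hypothetical.

```python
P, Q, G, H = 23, 11, 2, 3  # toy parameters, assumed for illustration


def pedersen(x: int, a: int) -> int:
    """B_a(x) = h^x * g^a in the toy group."""
    return pow(H, x % Q, P) * pow(G, a % Q, P) % P


def extract_dlog(x: int, a: int, y: int, b: int) -> int:
    """Given two openings (x, a) and (y, b) of one commitment with x != y,
    return log_g(h) = (a - b) / (y - x) mod q."""
    return (a - b) * pow((y - x) % Q, -1, Q) % Q


# Two different openings of the same commitment value.
assert pedersen(3, 5) == pedersen(4, 8)
log_h = extract_dlog(3, 5, 4, 8)
```

The modular inverse `pow(z, -1, Q)` requires Python 3.8 or later. Here `log_h` satisfies `pow(G, log_h, P) == H`, which is exactly the discrete logarithm the adversary was assumed not to know.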


We  end  this  section  by  noticing  that  the  two  discrete-logarithm-based  commitment  schemes  we 
have  given  possess  the  homomorphic  property: 


$$B(x_1) \cdot B(x_2) = g^{x_1} \cdot g^{x_2} = g^{x_1 + x_2} = B(x_1 + x_2),$$

$$B_{a_1}(x_1) \cdot B_{a_2}(x_2) = h^{x_1} \cdot g^{a_1} \cdot h^{x_2} \cdot g^{a_2} = h^{x_1 + x_2} \cdot g^{a_1 + a_2} = B_{a_1 + a_2}(x_1 + x_2).$$


We  shall  use  this  additively  homomorphic  property  when  we  discuss  an  electronic  voting  protocol 
at  the  end  of  Chapter  21. 
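The homomorphic property of $B(x)$ and of Pedersen's scheme $B_a(x)$ can be confirmed exhaustively in a small group. As before, the parameters $p = 23$, $q = 11$, $g = 2$, $h = 3$ are toy values assumed purely for illustration.

```python
P, Q, G, H = 23, 11, 2, 3  # toy parameters, assumed for illustration


def B(x: int) -> int:
    """B(x) = g^x."""
    return pow(G, x % Q, P)


def pedersen(x: int, a: int) -> int:
    """B_a(x) = h^x * g^a."""
    return pow(H, x % Q, P) * pow(G, a % Q, P) % P


def homomorphic_everywhere() -> bool:
    """Check both product identities for every choice of exponents modulo q."""
    for x1 in range(Q):
        for x2 in range(Q):
            if B(x1) * B(x2) % P != B(x1 + x2):
                return False
            for a1 in range(Q):
                for a2 in range(Q):
                    product = pedersen(x1, a1) * pedersen(x2, a2) % P
                    if product != pedersen(x1 + x2, a1 + a2):
                        return False
    return True
```

Multiplying commitments therefore adds the committed values (and the blinding values) without opening anything.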


20.3.  Oblivious  Transfer 

We now consider another type of basic protocol, called oblivious transfer or OT for short. This
is another protocol which is run between two distrusting parties, a sender and a receiver. In its
most basic form the sender has two secret messages as input, $m_0$ and $m_1$; the receiver has as input
a single bit $b$. The goal of an OT protocol is that at the end of the protocol the sender should
not learn the value of the receiver’s input $b$. However, the receiver should learn the value of $m_b$
but should learn nothing about $m_{1-b}$. Such a protocol is often called a 1-out-of-2 OT, since the
receiver learns one of the two inputs of the sender. Such a protocol is depicted in Figure 20.3. One
can easily generalize this concept to a $k$-out-of-$n$ OT, but as it is we will only be interested in the
simpler case.


[Figure 20.3: the sender inputs $m_0, m_1$; the receiver inputs $b$ and obtains $m_b$.]

Figure 20.3. A 1-out-of-2 OT

We present a scheme which allows us to perform a 1-out-of-2 oblivious transfer of two arbitrary
bit strings $m_0, m_1$ of equal length. The scheme is based on an IND-CPA version of DHIES, where
we simply use the exclusive-or of the plaintext with a hash of the underlying Diffie-Hellman key to
encrypt the payload.

We take a standard discrete-logarithm-based public/private key pair $(h \leftarrow g^x, x)$, where $g$ is a
generator of a cyclic finite abelian group $G$ of prime order $q$. We will require a hash function $H$ from
$G$ to bit strings of length $n$. Then to encrypt messages $m$ of length $n$ we compute, for a random
$k \in \mathbb{Z}/q\mathbb{Z}$,

$$c = (c_1, c_2) \leftarrow (g^k, m \oplus H(h^k)).$$

To decrypt we compute

$$c_2 \oplus H(c_1^x) = c_2 \oplus H(g^{kx}) = c_2 \oplus H(h^k) = m.$$

It can easily be shown that the above scheme is semantically secure under chosen plaintext attacks
(i.e. passive attacks) in the random oracle model.
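A minimal sketch of this hashed ElGamal variant follows. The group ($p = 23$, $q = 11$, $g = 2$), the use of truncated SHA-256 as $H$, the 16-byte message length and all function names are assumptions for illustration; the toy group provides no security.

```python
import hashlib
import secrets

P, Q, G = 23, 11, 2  # toy group of prime order 11 inside F_23^* (assumed)
N = 16               # assumed message length n, in bytes


def H(elem: int) -> bytes:
    """Hash a group element to an n-byte string (assumed instantiation: truncated SHA-256)."""
    return hashlib.sha256(elem.to_bytes(4, "big")).digest()[:N]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))


def keygen() -> tuple:
    """Return the key pair (h, x) with h = g^x."""
    x = secrets.randbelow(Q)
    return pow(G, x, P), x


def encrypt(h: int, m: bytes) -> tuple:
    """c = (g^k, m XOR H(h^k)) for a random k."""
    assert len(m) == N
    k = secrets.randbelow(Q)
    return pow(G, k, P), xor(m, H(pow(h, k, P)))


def decrypt(x: int, c: tuple) -> bytes:
    """m = c2 XOR H(c1^x)."""
    c1, c2 = c
    return xor(c2, H(pow(c1, x, P)))
```

A call such as `decrypt(x, encrypt(h, m))` returns `m` for any 16-byte `m` and any key pair produced by `keygen`.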

The idea behind our oblivious transfer protocol is for the receiver to create two public keys $h_0$
and $h_1$, for only one of which does he know the corresponding secret key. If the receiver knows
the secret key for $h_b$, where $b$ is the bit he is choosing, then he can decrypt messages encrypted
under this key, but cannot decrypt messages encrypted under the other key. The sender then only needs to encrypt his
messages with the two keys. Since the receiver only knows one secret key he can only decrypt one
of the messages.

To implement this idea concretely, the sender first selects a random element $c$ in $G$; it is
important that the receiver does not know the discrete logarithm of $c$ with respect to $g$. This value
is then sent to the receiver. The receiver then generates two public keys, according to his bit $b$, by
first generating $x \in \mathbb{Z}/q\mathbb{Z}$ and then computing

$$h_b \leftarrow g^x, \qquad h_{1-b} \leftarrow c/h_b.$$

Notice that the receiver knows the underlying secret key for $h_b$, but he does not know the secret
key for $h_{1-b}$ since he does not know the discrete logarithm of $c$ with respect to $g$. These two public
key values are then sent to the sender. The sender then encrypts message $m_0$ using the key $h_0$ and
message $m_1$ using key $h_1$, i.e. the sender computes

$$c_0 \leftarrow \left(g^{k_0}, m_0 \oplus H(h_0^{k_0})\right),$$

$$c_1 \leftarrow \left(g^{k_1}, m_1 \oplus H(h_1^{k_1})\right),$$

for two random integers $k_0, k_1 \in \mathbb{Z}/q\mathbb{Z}$. These two ciphertexts are then sent to the receiver who
then decrypts the $b$th one using his secret key $x$.

From the above description we can obtain some simple optimizations. Firstly, the receiver does
not need to send both $h_0$ and $h_1$ to the sender, since the sender can always compute $h_1$ from $h_0$ by
computing $c/h_0$. Secondly, we can use the same value of $k = k_0 = k_1$ in the two encryptions. We
thus obtain the following oblivious transfer protocol:

Sender: $c \leftarrow G$; sends $c$ to the receiver.

Receiver: $x \leftarrow \mathbb{Z}/q\mathbb{Z}$, $h_b \leftarrow g^x$, $h_{1-b} \leftarrow c/h_b$; sends $h_0$ to the sender.

Sender: $h_1 \leftarrow c/h_0$, $k \leftarrow \mathbb{Z}/q\mathbb{Z}$, $c_1 \leftarrow g^k$, $e_0 \leftarrow m_0 \oplus H(h_0^k)$, $e_1 \leftarrow m_1 \oplus H(h_1^k)$; sends $(c_1, e_0, e_1)$ to the receiver.

Receiver: $m_b \leftarrow e_b \oplus H(c_1^x)$.

So does this respect the two conflicting security requirements of the participants? First, note that
the sender cannot determine the hidden bit $b$ of the receiver since the value $h_0$ sent from the
receiver is simply a random element in $G$. Then we note that the receiver can learn nothing about
$m_{1-b}$ since to do this they would have to be able to compute the output of $H$ on the value $h_{1-b}^k$,
which would imply contradicting the fact that $H$ acts as a random oracle or being able to solve the
Diffie-Hellman problem in the group $G$.
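The optimized protocol above can be sketched end to end as follows. Everything concrete here is an assumption for illustration: the toy group ($p = 23$, $q = 11$, $g = 2$), truncated SHA-256 as $H$, the 16-byte message length, and the function names. The sender samples $c$ as $g^t$ for a random $t$ that is never revealed to the receiver.

```python
import hashlib
import secrets

P, Q, G = 23, 11, 2  # toy group of prime order 11 inside F_23^* (assumed)
N = 16               # assumed message length in bytes


def H(elem: int) -> bytes:
    return hashlib.sha256(elem.to_bytes(4, "big")).digest()[:N]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))


def sender_setup() -> int:
    """Sender picks a random c in G; its discrete logarithm is never sent."""
    return pow(G, secrets.randbelow(Q), P)


def receiver_keys(c: int, b: int) -> tuple:
    """Receiver sets h_b = g^x and h_{1-b} = c / h_b, and returns (x, h_0)."""
    x = secrets.randbelow(Q)
    h_b = pow(G, x, P)
    h_other = c * pow(h_b, -1, P) % P
    return x, (h_b if b == 0 else h_other)


def sender_encrypt(c: int, h0: int, m0: bytes, m1: bytes) -> tuple:
    """Sender recomputes h_1 = c / h_0 and encrypts both messages with one k."""
    h1 = c * pow(h0, -1, P) % P
    k = secrets.randbelow(Q)
    return pow(G, k, P), xor(m0, H(pow(h0, k, P))), xor(m1, H(pow(h1, k, P)))


def receiver_decrypt(x: int, b: int, c1: int, e0: bytes, e1: bytes) -> bytes:
    """Receiver recovers m_b = e_b XOR H(c1^x)."""
    return xor((e0, e1)[b], H(pow(c1, x, P)))


def run_ot(m0: bytes, m1: bytes, b: int) -> bytes:
    c = sender_setup()
    x, h0 = receiver_keys(c, b)
    c1, e0, e1 = sender_encrypt(c, h0, m0, m1)
    return receiver_decrypt(x, b, c1, e0, e1)
```

Calling `run_ot(m0, m1, b)` with two 16-byte messages returns `m0` when `b` is 0 and `m1` when `b` is 1. The modular inverse `pow(z, -1, P)` requires Python 3.8 or later.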


Chapter  Summary 


•  We  introduced  the  idea  of  protocols  between  mutually  untrusting  parties,  and  introduced 
commitment  and  oblivious  transfer  as  two  simple  examples  of  such  protocols. 

•  A  commitment  scheme  allows  one  party  to  bind  themselves  to  a  value,  and  then  reveal  it 
later. 

•  A  commitment  scheme  needs  to  be  both  binding  and  concealing.  Efficient  schemes  exist 
which  are  either  information-theoretically  binding  or  information-theoretically  concealing, 
but  not  both. 

•  An  oblivious  transfer  protocol  allows  a  sender  to  send  one  of  two  messages  to  a  recipient, 
but  she  does  not  know  which  message  is  actually  received.  The  receiver  also  learns  nothing 
about  the  other  message  which  was  sent. 


Further  Reading 

The above oblivious transfer protocol originally appeared in a slightly modified form in the paper
by Bellare and Micali. The paper by Naor and Pinkas discusses a number of optimizations of
the oblivious transfer protocol which we presented above. In particular it presents mechanisms to
efficiently perform 1-out-of-$N$ oblivious transfer. The papers by Blum and Shamir et al. provide
some nice early ideas related to commitment schemes.

CHAPTER 21


Zero-Knowledge  Proofs 


Chapter  Goals 


•  To  introduce  zero-knowledge  proofs. 

•  To  explain  the  notion  of  simulation. 

•  To  introduce  Sigma  protocols. 

•  To  explain  how  these  can  be  used  in  a  voting  protocol. 


21.1.  Showing  a  Graph  Isomorphism  in  Zero-Knowledge 

Suppose  Alice  has  a  password  and  wants  to  log  in  to  a  website  run  by  Bob,  but  she  does  not  quite 
trust  the  computer  Bob  is  using  to  verify  the  password.  If  she  just  sends  the  password  to  Bob  then 
Bob’s  computer  will  learn  the  whole  password.  To  get  around  this  problem  one  often  sees  websites 
that  ask  for  the  first,  fourth  and  tenth  letter  of  a  password  one  time,  and  then  maybe  the  first, 
second  and  fifth  the  second  time  and  so  on.  In  this  way  Bob’s  computer  only  learns  three  letters 
at  a  time.  So  the  password  can  be  checked  but  in  each  iteration  of  checking  only  three  letters  are 
leaked.  It  clearly  would  be  better  if  Bob  could  verify  that  Alice  has  the  password  in  such  a  way 
that  Alice  never  has  to  reveal  any  of  the  password  to  Bob.  This  is  the  problem  this  chapter  will 
try  to  solve. 

So  we  suppose  that  Alice  wants  to  convince  Bob  that  she  knows  something  without  Bob  finding 
out  exactly  what  Alice  knows.  This  apparently  contradictory  state  of  affairs  is  dealt  with  using 
zero-knowledge  proofs.  In  the  literature  of  zero-knowledge  proofs,  the  role  of  Alice  is  called  the 
prover,  since  she  wishes  to  prove  something,  whilst  the  role  of  Bob  is  called  the  verifier,  since  he 
wishes  to  verify  that  the  prover  actually  knows  something.  Often,  and  we  shall  also  follow  this 
convention,  the  prover  is  called  Peggy  and  the  verifier  is  called  Victor. 

The classic example of a zero-knowledge proof is based on the graph isomorphism problem.
Given two graphs $G_1$ and $G_2$, with the same number of vertices, we say that the two graphs are
isomorphic if there is a relabelling (i.e. a permutation) of the vertices of one graph which produces
the second graph. This relabelling $\phi$ is called a graph isomorphism, which is denoted by

$$\phi : G_1 \to G_2.$$

It is a computationally hard problem to determine a graph isomorphism between two graphs. As a
running example consider the two graphs in Figure 21.1, linked by the permutation $\phi = (1, 2, 4, 3)$.


Suppose Peggy knows the graph isomorphism $\phi$ between two public graphs $G_1$ and $G_2$, so we
have $G_2 = \phi(G_1)$. We call $\phi$ the prover’s private input, whilst the graphs $G_1$ and $G_2$ are the public
or common input. Peggy wishes to convince Victor that she knows the graph isomorphism, without
revealing to Victor the precise nature of the graph isomorphism. This is done using the following
zero-knowledge proof.

[Figure 21.1: two graphs $G_1$ and $G_2$ on the vertices 1 to 5, linked by $\phi = (1, 2, 4, 3)$.]

Figure 21.1. Example graph isomorphism

[Figure 21.2: the graph $H$ on the vertices 1 to 5.]

Figure 21.2. Peggy’s committed graph

• Peggy takes the graph $G_2$ and applies a secret random permutation $\psi$ to the vertices of
$G_2$ to produce another isomorphic graph $H \leftarrow \psi(G_2)$. In our running example we take
$\psi = (1, 2)$; the isomorphic graph $H$ is then given by Figure 21.2.

• Peggy now publishes $H$ as a commitment. She of course knows the following secret
graph isomorphisms

$$\phi : G_1 \to G_2, \qquad \psi : G_2 \to H, \qquad \psi \circ \phi : G_1 \to H.$$

• Victor now gives Peggy a challenge. He selects¹ $b \in \{1, 2\}$ and asks for the graph
isomorphism between $G_b$ and $H$.

• Peggy now gives her response by returning either $\chi = \psi$ or $\chi = \psi \circ \phi$, depending on the
value of $b$.

• Victor now verifies whether $\chi(G_b) = H$.

The transcript of the protocol then looks like

$$P \to V : H,$$

$$V \to P : b,$$

$$P \to V : \chi.$$

In our example if Victor chooses $b = 2$ then Peggy simply needs to publish $\psi$. However, if Victor
chooses $b = 1$ then Peggy publishes

$$\psi \circ \phi = (1, 2) \circ (1, 2, 4, 3) = (2, 4, 3).$$

We can then see that $(2, 4, 3)$ is the permutation which maps graph $G_1$ onto graph $H$. But to
compute this we needed to know the hidden isomorphism $\phi$. Thus when $b = 2$ Victor is checking
whether Peggy is honest in her commitment, whilst if $b = 1$ he is checking whether Peggy is honest
in her claim to know the isomorphism from $G_1$ to $G_2$.

If Peggy does not know the graph isomorphism $\phi$ then, in order to prepare a commitment $H$ that she can
answer for, she would need to know, before Victor gives his challenge, which graph $G_b$ he is going to pick.
Hence, if Peggy is cheating she will
only be able to respond correctly to Victor fifty percent of the time. So, repeating the above
protocol a number of times, a non-cheating Peggy will be able to convince Victor that she really
does know the graph isomorphism, with only a small probability that a cheating Peggy will be able to
convince Victor incorrectly.

¹ Since it is selected by Victor we denote the value $b$ in blue.
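One round of the protocol can be sketched as follows. The edge set chosen for $G_1$ is a hypothetical example (the edges of the graphs in Figure 21.1 are not reproduced here); only the permutations $\phi = (1, 2, 4, 3)$ and $\psi = (1, 2)$ are taken from the text. Graphs are represented as sets of undirected edges and permutations as dictionaries, both assumptions of this sketch.

```python
import random


def apply(perm: dict, graph: frozenset) -> frozenset:
    """Relabel every vertex of every edge according to perm."""
    return frozenset(frozenset(perm[v] for v in edge) for edge in graph)


def compose(outer: dict, inner: dict) -> dict:
    """(outer o inner)(v) = outer(inner(v))."""
    return {v: outer[inner[v]] for v in inner}


def random_perm(vertices: list, rng) -> dict:
    shuffled = list(vertices)
    rng.shuffle(shuffled)
    return dict(zip(vertices, shuffled))


def run_round(g1: frozenset, g2: frozenset, phi: dict, rng) -> bool:
    """Honest Peggy against honest Victor; returns Victor's verdict."""
    psi = random_perm(sorted(phi), rng)                # Peggy's fresh secret permutation
    h = apply(psi, g2)                                 # commitment H = psi(G2)
    b = rng.choice([1, 2])                             # Victor's challenge
    chi = psi if b == 2 else compose(psi, phi)         # Peggy's response
    return apply(chi, g1 if b == 1 else g2) == h       # Victor checks chi(G_b) = H


# Hypothetical example graph on the vertices 1..5.
G1 = frozenset(frozenset(e) for e in [(1, 2), (1, 3), (3, 4), (3, 5)])
PHI = {1: 2, 2: 4, 4: 3, 3: 1, 5: 5}                   # the cycle (1, 2, 4, 3)
G2 = apply(PHI, G1)
PSI = {1: 2, 2: 1, 3: 3, 4: 4, 5: 5}                   # the transposition (1, 2)
```

With these definitions `compose(PSI, PHI)` is the cycle $(2, 4, 3)$ from the running example, and `run_round(G1, G2, PHI, random.SystemRandom())` always accepts, which is the completeness of the protocol.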

Now  we  need  to  determine  whether  Victor  learns  anything  from  running  the  protocol,  i.e.  is 
Peggy’s  proof  really  zero-knowledge?  We  first  notice  that  Peggy  needs  to  produce  a  different  value 
of  H  on  every  run  of  the  protocol,  otherwise  Victor  can  trivially  cheat.  We  assume  therefore  that 
this  does  not  happen. 

One way to see whether Victor has learnt something after running the protocol is to look at the
transcript of the protocol and ask whether, having seen the transcript, Victor has gained any
knowledge, or for that matter whether anyone looking at the protocol but not interacting learns
anything. One way to see that Victor has not learnt anything is to see that Victor could have written
down a valid protocol transcript without interacting with Peggy at all. Hence, Victor cannot use
the protocol transcript to convince someone else that he knows Peggy’s secret isomorphism. He
cannot even use the protocol transcript to convince another party that Peggy knows the secret
graph isomorphism $\phi$.

Victor can produce a valid protocol transcript using the following simulation:

• $b \leftarrow \{1, 2\}$.

• Generate a random isomorphism $\chi$ of the graph $G_b$ to produce the graph $H$.

• Output the transcript

$$P \to V : H,$$

$$V \to P : b,$$

$$P \to V : \chi.$$
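The simulation can be sketched in a few lines. Note that it never uses $\phi$, and indeed the example below runs it on two hypothetical graphs that are not even isomorphic; the representation of graphs and permutations is an assumption of this sketch.

```python
import random


def apply(perm: dict, graph: frozenset) -> frozenset:
    """Relabel every vertex of every edge according to perm."""
    return frozenset(frozenset(perm[v] for v in edge) for edge in graph)


def simulate(g1: frozenset, g2: frozenset, vertices: list, rng) -> tuple:
    """Produce a transcript (H, b, chi) without any knowledge of phi."""
    b = rng.choice([1, 2])                       # pick the challenge first
    shuffled = list(vertices)
    rng.shuffle(shuffled)
    chi = dict(zip(vertices, shuffled))          # random relabelling of G_b
    h = apply(chi, g1 if b == 1 else g2)         # H is built to fit the challenge
    return h, b, chi


def accepts(g1: frozenset, g2: frozenset, transcript: tuple) -> bool:
    """Victor's check chi(G_b) = H, applied to a written-down transcript."""
    h, b, chi = transcript
    return apply(chi, g1 if b == 1 else g2) == h


# Hypothetical graphs: a path and a star on four vertices (not isomorphic).
PATH = frozenset(frozenset(e) for e in [(1, 2), (2, 3), (3, 4)])
STAR = frozenset(frozenset(e) for e in [(1, 2), (1, 3), (1, 4)])
```

Every output of `simulate` passes `accepts`, which is why a transcript on its own convinces nobody: the order of the first two messages has been swapped, something that is impossible in a live interaction.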

Hence,  the  interactive  nature  of  the  protocol  is  what  provides  the  “proof”  in  the  zero-knowledge 
proof.  We  remark  that  the  three-pass  system  of 

$$\text{commitment} \longrightarrow \text{challenge} \longrightarrow \text{response}$$

is  the  usual  characteristic  of  such  protocols  when  deployed  in  practice.  Notice  how  this  is  similar 
to  the  signature  schemes  we  discussed  in  Section  16.5.4.  This  is  not  coincidental,  as  we  shall  point 
out  below. 

Clearly  two  basic  properties  of  an  interactive  proof  system  are 

•  Completeness:  If  Peggy  really  knows  the  thing  being  proved  and  follows  the  protocol, 
then  Victor  should  accept  her  proof  with  probability  one. 

•  Soundness:  If  Peggy  does  not  know  the  thing  being  proved,  then  whatever  she  does, 
Victor  should  only  have  a  small  probability  of  actually  accepting  the  proof. 

Just  as  with  commitment  schemes  we  can  divide  zero-knowledge  protocols  into  categories  depending 
on  whether  they  are  secure  with  respect  to  computationally  bounded  or  unbounded  adversaries. 
We usually assume that Victor is a polynomially bounded party, whilst Peggy is unbounded². In
the above protocol based on graph isomorphism we saw that a cheating Peggy is caught in a single
run with probability only one half. Hence, we needed to repeat the protocol a number of times to
improve this to something close to one.
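As a small worked calculation, and assuming that the runs are independent with fresh commitments and fresh challenges, a cheating Peggy survives $n$ runs with probability $2^{-n}$. The helper names below are hypothetical.

```python
from fractions import Fraction


def cheat_probability(rounds: int) -> Fraction:
    """Probability that a cheating prover guesses every challenge in `rounds` independent runs."""
    return Fraction(1, 2) ** rounds


def rounds_needed(target: Fraction) -> int:
    """Smallest number of runs pushing the cheating probability to at most `target`."""
    n = 0
    while cheat_probability(n) > target:
        n += 1
    return n
```

For instance, ten runs leave a cheating probability of $1/1024$, and twenty runs are needed to get below one in a million.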

The zero-knowledge property we have already noted is related to the concept of a simulation.
Suppose the set of valid transcripts (produced by true protocol runs) is denoted by $\mathcal{V}$, and let the
set of possible simulations be denoted by $\mathcal{S}$. The security is therefore related to how much the
set $\mathcal{V}$ is like the set $\mathcal{S}$. A zero-knowledge proof is said to have perfect zero-knowledge if the two
sets $\mathcal{V}$ and $\mathcal{S}$ are essentially identical, in which case we write $\mathcal{V} = \mathcal{S}$. If the two sets have “small”
statistical distance³, but cannot otherwise be distinguished by an all-powerful adversary, we say we
have statistical zero-knowledge, and we write $\mathcal{V} \approx_s \mathcal{S}$. If the two sets are only indistinguishable
by a computationally bounded adversary we say that the zero-knowledge proof has computational
zero-knowledge, and we write $\mathcal{V} \approx_c \mathcal{S}$.

² Although in many of our examples the existence of a witness is certain and hence we might as well assume that
Peggy knows the witness already and is bounded.


21.2. Zero-Knowledge and $\mathcal{NP}$

So the question arises as to what can be shown in zero-knowledge. Above we showed that the
knowledge of whether two graphs are isomorphic can be shown in zero-knowledge. Thus the decision
problem of Graph Isomorphism lies in the set of all decision problems which can be proven in
zero-knowledge. But Graph Isomorphism is believed to lie between the complexity classes $\mathcal{P}$ and
$\mathcal{NP}$-complete, i.e. it is believed that it can neither be solved in polynomial time, nor is it $\mathcal{NP}$-complete.

We can think of $\mathcal{NP}$ problems as those problems for which there is a witness (or proof) which
can be produced by an all-powerful prover, but for which a polynomially bounded verifier is able to
verify the proof. However, for the class of $\mathcal{NP}$ problems the prover and the verifier do not interact,
i.e. the proof is produced and then the verifier verifies it.

If we allow interaction then something quite amazing happens. Consider an all-powerful prover
who interacts with a polynomially bounded verifier. We wish the prover to convince the verifier of
the validity of some statement. This is exactly what we had in the previous section except that we
only require the completeness and soundness properties, i.e. we do not require the zero-knowledge
property. The decision problems which can be proven to be true in such a manner form the
complexity class of interactive proofs, or $\mathcal{IP}$. It can be shown that the complexity class $\mathcal{IP}$ is equal
to the complexity class $\mathcal{PSPACE}$, i.e. the set of all decision problems which can be solved using
polynomial space. It is widely believed that $\mathcal{NP} \subsetneq \mathcal{PSPACE}$, which implies that having interaction
really gives us something extra.

So what happens to interactive proofs when we add the zero-knowledge requirement? We can
define a complexity class $\mathcal{CZK}$ of all decision problems whose solutions can be verified using a
computational zero-knowledge proof. We have already shown that the problem of Graph Isomorphism
lies in $\mathcal{CZK}$, but this might not include all of the $\mathcal{NP}$ problems. However, since 3-colourability is
$\mathcal{NP}$-complete, we have the following elegant proof that $\mathcal{NP} \subset \mathcal{CZK}$.

Theorem 21.1. The problem of 3-colourability of a graph lies in $\mathcal{CZK}$, assuming a computationally
concealing commitment scheme exists.


Proof. Consider a graph $G = (V, E)$ in which the prover knows a colouring $\psi$ of $G$, i.e. a map
$\psi : V \to \{1, 2, 3\}$ such that $\psi(v_1) \neq \psi(v_2)$ if $(v_1, v_2) \in E$. The prover first selects a commitment
scheme $C(x, r)$ and a random permutation $\pi$ of the set $\{1, 2, 3\}$. Note that the function $\pi(\psi(v))$
defines another three-colouring of the graph. Now the prover commits to this second three-colouring
by sending to the verifier the commitments

$$c_i = C(\pi(\psi(v_i)), r_i) \quad \text{for all } v_i \in V.$$

The verifier then selects a random edge $(v_i, v_j) \in E$ and sends this to the prover. The prover now
decommits to the two corresponding commitments, by returning the values of

$$\pi(\psi(v_i)) \quad \text{and} \quad \pi(\psi(v_j)),$$

together with the randomness $r_i$ and $r_j$ needed to check the decommitments, and the verifier checks that

$$\pi(\psi(v_i)) \neq \pi(\psi(v_j)).$$

We now turn to the three required properties of a zero-knowledge proof.

Completeness: The above protocol is complete since any valid prover will get the verifier to
accept with probability one.

Soundness: If we have a cheating prover, then at least one edge is invalid, and with probability
at least $1/|E|$ the verifier will select an invalid edge. Thus with probability at most $1 - 1/|E|$ a
cheating prover will get a verifier to accept. By repeating the above proof many times one can
reduce this probability to as low a value as we require.

Zero-Knowledge: Assuming the commitment scheme is computationally concealing, the obvious
simulation and the real protocol will be computationally indistinguishable. □

³ See Chapter 7.
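One round of the 3-colourability protocol from the proof of Theorem 21.1 can be sketched as follows. The hash-based commitment $H(r \| m)$ with SHA-256 is an assumed instantiation of $C(x, r)$, and the example graph, colouring and function names are hypothetical.

```python
import hashlib
import random
import secrets


def commit(colour: int, r: bytes) -> bytes:
    """Assumed commitment C(x, r) = SHA-256(r || x) to a single colour."""
    return hashlib.sha256(r + bytes([colour])).digest()


def prover_commit(colouring: dict, rng) -> tuple:
    """Permute the colours with a random pi and commit to pi(psi(v)) for every vertex."""
    shuffled = [1, 2, 3]
    rng.shuffle(shuffled)
    pi = dict(zip([1, 2, 3], shuffled))
    permuted = {v: pi[c] for v, c in colouring.items()}
    rand = {v: secrets.token_bytes(32) for v in colouring}
    commitments = {v: commit(permuted[v], rand[v]) for v in colouring}
    return commitments, permuted, rand


def verifier_check(commitments: dict, edge: tuple, opening: tuple) -> bool:
    """Check both decommitments and that the two revealed colours differ."""
    vi, vj = edge
    (ci, ri), (cj, rj) = opening
    return (ci in (1, 2, 3) and cj in (1, 2, 3) and ci != cj
            and commit(ci, ri) == commitments[vi]
            and commit(cj, rj) == commitments[vj])


def run_round(edges: list, colouring: dict, rng) -> bool:
    commitments, permuted, rand = prover_commit(colouring, rng)
    vi, vj = rng.choice(edges)                        # verifier's random edge
    opening = ((permuted[vi], rand[vi]), (permuted[vj], rand[vj]))
    return verifier_check(commitments, (vi, vj), opening)


# Hypothetical example: a triangle with one pendant vertex, and a valid colouring.
EDGES = [(1, 2), (2, 3), (1, 3), (3, 4)]
COLOURING = {1: 1, 2: 2, 3: 3, 4: 1}
```

With a valid colouring every call to `run_round(EDGES, COLOURING, rng)` accepts; the verifier only ever sees two distinct, freshly permuted colours, which is what the simulation argument relies on.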


Notice  that  this  is  a  very  powerful  result.  It  says  that  virtually  any  statement  which  is  likely  to 
come  up  in  cryptography  can  be  proved  in  zero-knowledge.  Clearly  the  above  proof  would  not 
provide  a  practical  implementation,  but  at  least  we  know  that  very  powerful  tools  can  be  applied. 
In  the  next  section  we  turn  to  proofs  that  can  be  applied  in  practice.  But  before  doing  that  we 
note  that  the  above  result  can  be  extended  even  further. 

Theorem 21.2. If one-way functions exist then $\mathcal{CZK} = \mathcal{IP}$, and hence $\mathcal{CZK} = \mathcal{PSPACE}$.


21.3.  Sigma  Protocols 

One  can  use  a  zero-knowledge  proof  of  possession  of  some  secret  as  an  identification  scheme.  The 
secret in the identification scheme will be the hidden information, or witness, e.g. the graph
isomorphism in our previous example. Then we use the zero-knowledge protocol to prove that the
person  knows  the  isomorphism,  without  revealing  anything  about  it.  The  trouble  with  the  above 
protocol  for  graph  isomorphisms  is  that  it  is  not  very  practical.  The  data  structures  required  are 
very  large,  and  the  protocol  needs  to  be  repeated  a  large  number  of  times  before  Victor  is  convinced 
that  Peggy  really  knows  the  secret. 

A more practical alternative is exactly what a Sigma protocol provides. It is a three-move protocol: the prover goes first
(in  the  commitment  phase),  then  the  verifier  responds  (with  the  challenge),  and  finally  the  prover 
provides  the  final  response;  the  verifier  is  then  able  to  verify  the  proof.  This  is  exactly  like  our 
graph  isomorphism  proof  earlier.  But  for  Sigma  protocols  we  make  some  simplifying  assumptions; 
in  particular  we  assume  that  the  verifier  is  honest  (in  that  he  will  always  follow  the  protocol 
correctly). 

21.3.1. Schnorr’s Identification Protocol: In essence we have already seen a Sigma protocol
which has better bandwidth and error properties when we discussed Schnorr signatures in Chapter 16.
Suppose Peggy’s secret is now the discrete logarithm $x$ of $y$ with respect to $g$ in some
finite abelian group $G$ of prime order $q$. To create an identification protocol, we want to show in
zero-knowledge that Peggy knows the value of $x$. The protocol for proof of knowledge now goes as
follows:

$$P \to V : r \leftarrow g^k \text{ for a random } k \leftarrow \mathbb{Z}/q\mathbb{Z},$$

$$V \to P : e \leftarrow \mathbb{Z}/q\mathbb{Z},$$

$$P \to V : s \leftarrow k + x \cdot e \pmod{q}.$$

Victor now verifies that Peggy knows the secret discrete logarithm $x$ by verifying that $r = g^s \cdot y^{-e}$.
Let us examine this protocol in more detail.

Completeness:  We  first  note  that  the  protocol  is  complete,  in  that  if  Peggy  actually  knows  the 
discrete  logarithm  then  Victor  will  accept  the  protocol  since 

$$g^s \cdot y^{-e} = g^{k + x \cdot e} \cdot (g^x)^{-e} = g^k = r.$$

Soundness: If Peggy does not know the discrete logarithm $x$ then one can informally argue that
she will only be able to cheat with probability $1/q$, which is much better than the $1/2$ from the
earlier graph-isomorphism-based protocol. We can however show that the protocol has something
called the special soundness property.

Definition 21.3 (Special Soundness). Suppose that we have two protocol runs with transcripts
(r, e, s) and (r, e', s'). Note that the commitments are equal but that the challenges (and hence
responses) are different. A protocol is said to have the special soundness property if given two such
transcripts we can recover x.

As an example, for our Schnorr protocol above, given two such verifying transcripts we have that

    r = g^s \cdot y^{-e} = g^{s'} \cdot y^{-e'} = r.

This implies in turn that

    s + x \cdot (-e) = s' + x \cdot (-e') \pmod{q}.

Hence, we recover x via

    x = \frac{s - s'}{e - e'} \pmod{q}.


Notice that this proof of soundness is almost exactly the same as our use of the forking lemma
to show that Schnorr signatures are EUF-CMA secure assuming discrete logarithms are hard. The
above algorithm, which takes (r, e, s) and (r, e', s') and outputs the discrete logarithm x, is a
knowledge extractor as defined below. It is the existence of this algorithm which shows we have a
zero-knowledge proof of knowledge, as defined below, and not just a zero-knowledge proof.

More formally, suppose we have a statement, say X \in \mathcal{L}, where \mathcal{L} is some
language in NP. Since the language lies in NP we know there exists a witness w. Now a
zero-knowledge proof is an interactive protocol which given a statement X \in \mathcal{L} will allow
an infinitely powerful prover to demonstrate that X \in \mathcal{L}. Note here that the prover may
not actually “know” the witness. However, a protocol is said to be a zero-knowledge proof of
knowledge if it is a zero-knowledge proof and there exists an algorithm, called a knowledge
extractor E, which can use a valid prover to output the witness w.

In our above example we take the prover and run her once, then rewind her to the point when
she asks for the verifier’s challenge. We then supply her with another challenge and thus end up
obtaining the two tuples (r, e, s) and (r, e', s'). Then the special soundness implies that there is
a knowledge extractor E which takes as input (r, e, s) and (r, e', s') and outputs the witness. We
write x \gets E((r, e, s), (r, e', s')).
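
A minimal sketch of the knowledge extractor for the Schnorr protocol follows. The toy group (p = 2039, q = 1019, g = 4), the fixed values of x, k and the two challenges, and the function names are all assumptions made for illustration; "rewinding" is modelled simply by answering two challenges with the same k.

```python
# Toy parameters (illustrative assumption, not secure): g = 4 has prime order q in (Z/pZ)*.
P, Q, G = 2039, 1019, 4

def extract(t1, t2):
    """Knowledge extractor E: x = (s - s') / (e - e') mod q."""
    (r, e, s), (r_prime, e_prime, s_prime) = t1, t2
    if r != r_prime or (e - e_prime) % Q == 0:
        raise ValueError("need equal commitments and distinct challenges")
    inv = pow((e - e_prime) % Q, Q - 2, Q)  # inverse modulo the prime q (Fermat)
    return (s - s_prime) * inv % Q

def prover_response(x, k, e):
    """Honest Schnorr response s = k + x*e (mod q)."""
    return (k + x * e) % Q

# "Rewinding": the same k (hence the same r) is answered for two different challenges.
x = 777
y = pow(G, x, P)
k = 123
r = pow(G, k, P)
e, e_prime = 5, 42
s, s_prime = prover_response(x, k, e), prover_response(x, k, e_prime)
recovered = extract((r, e, s), (r, e_prime, s_prime))
```

Here `recovered` equals the secret `x`, which is why a prover must never reuse a commitment across two runs.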


Zero-Knowledge: But does Victor learn anything from the protocol? The answer to this is no,
since Victor could simulate the whole transcript in the following way.

• e \gets \mathbb{Z}/q\mathbb{Z}.

• s \gets \mathbb{Z}/q\mathbb{Z}.

• r \gets g^s \cdot y^{-e}.

• Output the transcript

    P -> V : r,
    V -> P : e,
    P -> V : s.

In other words the protocol is zero-knowledge, in that someone cannot tell the simulation of a
transcript from a real transcript. This is exactly the same simulation we used when simulating the
signing queries in our proof of security of Schnorr signatures in Theorem 16.12.
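
The simulation can be sketched as below, again over an assumed toy group (p = 2039, q = 1019, g = 4) with hypothetical function names. The point to notice is that `simulate` receives only the public value y, never the discrete logarithm, and that the response s is sampled uniformly alongside e.

```python
import secrets

# Toy parameters (illustrative assumption, not secure): g = 4 has prime order q in (Z/pZ)*.
P, Q, G = 2039, 1019, 4

def simulate(y):
    """Produce an accepting transcript (r, e, s) using only the public value y."""
    e = secrets.randbelow(Q)
    s = secrets.randbelow(Q)
    r = pow(G, s, P) * pow(y, (-e) % Q, P) % P   # r = g^s * y^(-e)
    return r, e, s

def verify(y, r, e, s):
    """Victor's check r = g^s * y^(-e)."""
    return r == pow(G, s, P) * pow(y, (-e) % Q, P) % P

y = pow(G, secrets.randbelow(Q), P)   # the exponent is discarded; simulate never sees it
r, e, s = simulate(y)
simulated_ok = verify(y, r, e, s)
```

The simulated transcript verifies by construction, because r is defined to be exactly the value the verification equation demands.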

21.3.2. Formalizing Sigma Protocols: Before we discuss other Sigma protocols, we introduce
some notation to aid our discussion. We assume we have some statement X \in \mathcal{L}, for some NP
language \mathcal{L}, and we want to prove that the prover “knows” the underlying witness w.

Suppose  we  wish  to  prove  knowledge  of  the  variable  x  via  a  Sigma  protocol,  then 

• R(w, k) denotes the algorithm used to compute the commitment r, where k is the random
value used to produce the commitment.

• e is the challenge from a set \mathbb{E}.

• S(e, w, k) denotes the algorithm which the prover uses to compute her response s \in \mathbb{S}
given e.

• V(r, e, s) denotes the verification algorithm.

• S'(e, s) denotes the simulator’s algorithm which creates a value of a commitment r which
will verify the transcript (r, e, s), for random input values e \in \mathbb{E} and s \in \mathbb{S}.

• E((r, e, s), (r, e', s')) is the knowledge extractor which will output w.

All  algorithms  are  assumed  to  implicitly  have  as  input  the  public  statement  X  for  which  w  is  a 
witness.  Using  this  notation  Schnorr’s  identification  protocol  becomes  the  following.  The  statement 
X  is  that 

    \exists x such that y = g^x,

and the witness is w = x. We then have

    R(x, k)                      := r \gets g^k,
    S(e, x, k)                   := s \gets k + e \cdot x \pmod{q},
    V(r, e, s)                   := true if and only if (r = g^s \cdot y^{-e}),
    S'(e, s)                     := r \gets g^s \cdot y^{-e},
    E((r, e, s), (r, e', s'))    := x \gets \frac{s - s'}{e - e'} \pmod{q}.


21.3.3. Associated Identification Protocol: We can create an identification protocol from any
Sigma protocol as follows: We have some statement X which is bound to an entity P such that P
has been given the witness w for the statement being valid. To prove that P really does know w
without revealing anything about w, we execute the following protocol:

    P -> V : r \gets R(w, k),

    V -> P : e \gets \mathbb{E},

    P -> V : s \gets S(e, w, k),

where  the  verifier  accepts  the  claimed  identity  if  and  only  if  V (r,  e,  s )  returns  true.  One  can  think 
of  w  in  this  protocol  as  a  very  strong  form  of  “password”  authentication,  where  no  information 
about  the  “password”  is  leaked.  Of  course  this  will  only  be  secure  if  finding  the  witness  given  only 
the  statement  is  a  hard  problem,  since  otherwise  anyone  could  compute  the  witness  on  their  own. 


21.3.4.  Associated  Signature  Schemes:  We  have  already  seen  how  the  Schnorr  signature 
scheme  and  Schnorr’s  identification  protocol  are  related.  It  turns  out  that  any  Sigma  protocol 
with  the  special  soundness  property  can  be  turned  into  a  digital  signature  scheme.  The  method 
of  transformation  is  called  the  Fiat-Shamir  heuristic,  since  it  only  produces  a  heuristically  secure 
scheme  as  the  security  proof  requires  the  use  of  the  random  oracle  H  whose  codomain  is  the  set  E. 

• Key Generation: The public key is the statement X \in \mathcal{L}, and the secret key is the
witness w.

• Signing: The signature on a message m is generated by

    r \gets R(w, k),
    e \gets H(r \| m),
    s \gets S(e, w, k).

  Output (e, s) as the signature.

• Verification: Generate r' \gets S'(e, s), and then check whether e = H(r' \| m).
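
Instantiating the transformation with the Schnorr Sigma protocol gives the following sketch. It rests on several assumptions made here for illustration: the toy group (p = 2039, q = 1019, g = 4), SHA-256 reduced modulo q as a stand-in for the random oracle H, and an ad hoc byte encoding of r‖m. None of these choices comes from the text.

```python
import hashlib
import secrets

# Toy parameters (illustrative assumption, not secure): g = 4 has prime order q in (Z/pZ)*.
P, Q, G = 2039, 1019, 4

def hash_to_challenge(r, message):
    """Stand-in for the random oracle H with codomain Z/qZ (the encoding is an assumption)."""
    data = str(r).encode() + b"|" + message
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def sign(x, message):
    k = secrets.randbelow(Q)
    r = pow(G, k, P)                    # r <- R(w, k)
    e = hash_to_challenge(r, message)   # e <- H(r || m)
    s = (k + e * x) % Q                 # s <- S(e, w, k)
    return e, s

def verify(y, message, signature):
    e, s = signature
    r_prime = pow(G, s, P) * pow(y, (-e) % Q, P) % P   # r' <- S'(e, s)
    return e == hash_to_challenge(r_prime, message)

x = secrets.randbelow(Q)
y = pow(G, x, P)
signature = sign(x, b"attack at dawn")
valid = verify(y, b"attack at dawn", signature)
```

Verification recomputes the commitment with the simulator S' and checks that hashing it reproduces e, so the commitment itself never needs to be transmitted.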

Using  the  same  technique  as  in  the  proof  of  Theorem  16.12  we  can  prove  the  following. 

Theorem 21.4. In the random oracle model let A denote an EUF-CMA adversary against the
above signature scheme with advantage \epsilon, making q_H queries to its hash function H. Then there is
an adversary B against the associated Sigma protocol which given X \in \mathcal{L} can extract the witness w
with advantage \epsilon' such that


QH  QH 

Proof.  The  proof  is  virtually  identical  to  that  of  Theorem  16.12.  We  take  the  adversary  A  and 
wrap  it  inside  another  algorithm  A'  which  does  not  make  queries  to  a  signature  oracle.  This  is  done 
using  the  simulator  S'  for  the  Sigma  protocol.  We  then  apply  the  forking  lemma  to  A'  in  order  to 
construct  an  algorithm  B  which  will  output  a  pair  of  tuples  (r,e,  s)  and  (r,e',s').  We  then  pass 
these  tuples  to  the  knowledge  extractor  E  in  order  to  recover  the  witness  w.  □ 


21.3.5. Non-interactive Proofs: Sometimes we want to prove something in zero-knowledge but
not have the interactive nature of a protocol. For example one entity may be sending some encrypted
data to another entity, but wants to prove to anyone seeing the ciphertext that it encrypts a value
from a given subset. If a statement can be proved with a Sigma protocol we can turn it into a
non-interactive proof by replacing the verifier’s challenge component with a hash of the commitment
and the statement. This last point is often forgotten, since it is not needed in the signature example
above, and hence people forget about it when producing general non-interactive proofs. Hence, in
the Schnorr proof of knowledge of discrete logarithms protocol we would define the challenge as

    e \gets H(r \| y).
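
As a sketch of this idea, the following non-interactive Schnorr proof hashes both the commitment r and the public value y, which stands in for the statement. The toy group (p = 2039, q = 1019, g = 4), the use of SHA-256 reduced modulo q as the hash, and the string encoding are assumptions made here for illustration.

```python
import hashlib
import secrets

# Toy parameters (illustrative assumption, not secure): g = 4 has prime order q in (Z/pZ)*.
P, Q, G = 2039, 1019, 4

def challenge(r, y):
    """Challenge derived from the commitment r AND the statement (represented by y)."""
    data = f"{r}|{y}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(x):
    """Return the public value y = g^x together with a non-interactive proof (r, s)."""
    y = pow(G, x, P)
    k = secrets.randbelow(Q)
    r = pow(G, k, P)
    e = challenge(r, y)
    s = (k + x * e) % Q
    return y, (r, s)

def check(y, proof):
    r, s = proof
    e = challenge(r, y)
    return r == pow(G, s, P) * pow(y, (-e) % Q, P) % P

y, proof = prove(secrets.randbelow(Q))
proof_ok = check(y, proof)
```

Binding the statement into the hash means a proof produced for one y cannot simply be relabelled as a proof for a different y chosen after the fact.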


21.3.6. Chaum-Pedersen Protocol: We now present a Sigma protocol called the Chaum-Pedersen
protocol which was first presented in the context of electronic cash systems, but which
has very wide application. Suppose Peggy wishes to prove she knows two discrete logarithms

    y_1 = g^{x_1} and y_2 = h^{x_2}

such that x_1 = x_2, i.e. we wish to present not only a proof of knowledge of the discrete logarithms,
but also a proof of equality of the hidden discrete logarithms. We assume that g and h generate
groups of prime order q, and we denote the common discrete logarithm by x to ease notation. Using
our prior notation for Sigma protocols, the Chaum-Pedersen protocol can be expressed via

    R(x, k)                      := (r_1, r_2) \gets (g^k, h^k),
    S(e, x, k)                   := s \gets k - e \cdot x \pmod{q},
    V((r_1, r_2), e, s)          := true if and only if (r_1 = g^s \cdot y_1^e and r_2 = h^s \cdot y_2^e),
    S'(e, s)                     := (r_1, r_2) \gets (g^s \cdot y_1^e, h^s \cdot y_2^e),
    E((r, e, s), (r, e', s'))    := x \gets \frac{s' - s}{e - e'} \pmod{q}.
Note how this resembles two concurrent runs of the Schnorr protocol, but with a single challenge
value. The Chaum-Pedersen protocol is clearly complete and has the zero-knowledge property;
the second fact follows since the simulation S'(e, s) produces transcripts which are indistinguishable
from a real transcript.

We show it is sound, by showing it has the special soundness property. Hence, we assume
two protocol runs with the same commitments (r_1, r_2), but with different challenges e and e', and
corresponding valid responses s and s'. With this data we need to show that this reveals the
common discrete logarithm via the extractor E, and that the discrete logarithm is indeed common.
Since the two transcripts pass the verification test we have that

    r_1 = g^s \cdot y_1^e = g^{s'} \cdot y_1^{e'} and r_2 = h^s \cdot y_2^e = h^{s'} \cdot y_2^{e'}.

But this implies that

    y_1^{e - e'} = g^{s' - s} and y_2^{e - e'} = h^{s' - s},

and so

    (e - e') \cdot dlog_g(y_1) = s' - s and (e - e') \cdot dlog_h(y_2) = s' - s.

Hence, the two discrete logarithms are equal and can be extracted from

    x = \frac{s' - s}{e - e'} \pmod{q}.
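
The protocol and its extractor can be exercised with a short sketch. The toy group (p = 2039, q = 1019) and the two generators g = 4 and h = 9 are assumptions chosen here for illustration (both are squares modulo p and so have order q); the fixed challenges 17 and 400 merely model a rewound prover.

```python
import secrets

# Toy parameters (illustrative assumption, not secure).
P, Q = 2039, 1019
G, H = 4, 9   # both are squares mod p, hence of prime order q

def commit():
    """R(x, k): (r1, r2) = (g^k, h^k)."""
    k = secrets.randbelow(Q)
    return k, (pow(G, k, P), pow(H, k, P))

def respond(x, k, e):
    """S(e, x, k): s = k - e*x (mod q)."""
    return (k - e * x) % Q

def verify(y1, y2, r, e, s):
    """V: r1 = g^s * y1^e and r2 = h^s * y2^e."""
    r1, r2 = r
    return (r1 == pow(G, s, P) * pow(y1, e, P) % P and
            r2 == pow(H, s, P) * pow(y2, e, P) % P)

def extract(e, s, e_prime, s_prime):
    """E: x = (s' - s) / (e - e') mod q."""
    inv = pow((e - e_prime) % Q, Q - 2, Q)
    return (s_prime - s) * inv % Q

x = secrets.randbelow(Q)
y1, y2 = pow(G, x, P), pow(H, x, P)
k, r = commit()
e, e_prime = 17, 400
s, s_prime = respond(x, k, e), respond(x, k, e_prime)
both_accept = verify(y1, y2, r, e, s) and verify(y1, y2, r, e_prime, s_prime)
recovered = extract(e, s, e_prime, s_prime)
```

Both transcripts verify, and `recovered` equals x, matching the extraction formula derived above.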


21.3.7.  Proving  Knowledge  of  Pedersen  Commitments:  Often  one  commits  to  a  value  using 
a  commitment  scheme,  but  the  receiver  is  not  willing  to  proceed  unless  one  proves  one  knows  the 
value  committed  to.  In  other  words  the  receiver  will  only  proceed  if  he  knows  that  the  sender  will 
at  some  point  be  able  to  reveal  the  value  committed  to.  For  the  commitment  scheme 

    B(x) = g^x

this is simple: we simply execute Schnorr’s protocol for proof of knowledge of a discrete logarithm.
For Pedersen commitments B_a(x) = h^x \cdot g^a we need something different. In essence we wish to
prove knowledge of x_1 and x_2 such that

    y = g_1^{x_1} \cdot g_2^{x_2}

where g_1 and g_2 are elements in a group of prime order q. We note that the following protocol
generalizes easily to the case when we have more bases, i.e.

    y = g_1^{x_1} \cdots g_n^{x_n},

a generalization that we leave as an exercise. In terms of our standard notation for Sigma protocols
we have

    R(x, (k_1, k_2))                  := r \gets g_1^{k_1} \cdot g_2^{k_2},
    S(e, (x_1, x_2), (k_1, k_2))      := (s_1, s_2) \gets (k_1 + e \cdot x_1 \pmod{q}, k_2 + e \cdot x_2 \pmod{q}),
    V(r, e, (s_1, s_2))               := true if and only if (g_1^{s_1} \cdot g_2^{s_2} = y^e \cdot r),
    S'(e, (s_1, s_2))                 := r \gets g_1^{s_1} \cdot g_2^{s_2} \cdot y^{-e}.

We leave it to the reader to verify that this protocol is complete and zero-knowledge, and to work
out the knowledge extractor.
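
A sketch of the four algorithms R, S, V and S' for the two-base case is given below, leaving the extractor as the exercise it is in the text. The toy group (p = 2039, q = 1019) and bases g_1 = 4, g_2 = 9 are assumptions made for illustration; in a real Pedersen commitment nobody should know the discrete logarithm of one base with respect to the other.

```python
import secrets

# Toy parameters (illustrative assumption, not secure): both bases have prime order q.
P, Q = 2039, 1019
G1, G2 = 4, 9

def commit():
    """R: r = g1^k1 * g2^k2."""
    k1, k2 = secrets.randbelow(Q), secrets.randbelow(Q)
    return (k1, k2), pow(G1, k1, P) * pow(G2, k2, P) % P

def respond(e, xs, ks):
    """S: (s1, s2) = (k1 + e*x1, k2 + e*x2) mod q."""
    return tuple((k + e * x) % Q for x, k in zip(xs, ks))

def verify(y, r, e, ss):
    """V: g1^s1 * g2^s2 = y^e * r."""
    s1, s2 = ss
    return pow(G1, s1, P) * pow(G2, s2, P) % P == pow(y, e, P) * r % P

def simulate(y, e, ss):
    """S': r = g1^s1 * g2^s2 * y^(-e)."""
    s1, s2 = ss
    return pow(G1, s1, P) * pow(G2, s2, P) * pow(y, (-e) % Q, P) % P

xs = (secrets.randbelow(Q), secrets.randbelow(Q))
y = pow(G1, xs[0], P) * pow(G2, xs[1], P) % P

ks, r = commit()
e = secrets.randbelow(Q)
real_ok = verify(y, r, e, respond(e, xs, ks))

fake_ss = (secrets.randbelow(Q), secrets.randbelow(Q))
simulated_ok = verify(y, simulate(y, e, fake_ss), e, fake_ss)
```

Both the honest transcript and the simulated one verify, which illustrates completeness and the shape of the zero-knowledge simulation respectively.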


21.3.8.  “Or”  Proofs:  Sometimes  the  statement  about  which  we  wish  to  execute  a  Sigma  protocol 
is  not  as  clear  cut  as  the  previous  examples.  For  example,  suppose  we  wish  to  show  we  know  either 
a  secret  x  or  a  secret  y,  without  revealing  which  of  the  two  secrets  we  know.  This  is  a  very 
common  occurrence  which  arises  in  a  number  of  advanced  protocols,  including  the  voting  protocol 
we  consider  later  in  this  chapter.  It  turns  out  that  to  show  knowledge  of  one  thing  or  another  can 
be performed using an elegant protocol due to Cramer, Damgård and Schoenmakers.

First assume that there already exist Sigma protocols to prove knowledge of both secrets
individually. We will combine these two Sigma protocols together into one protocol which proves
the statement we require. The key idea is as follows: For the secret we know, we run the Sigma
protocol as normal; however, for the secret we do not know, we run the simulated Sigma protocol.
These two protocols are then linked together by linking the commitments.

As a high-level example, suppose the Sigma protocol for proving knowledge of x is given by the
set of algorithms

    R_1(x, k_1), S_1(e_1, x, k_1), V_1(r_1, e_1, s_1), S'_1(e_1, s_1), E_1((r_1, e_1, s_1), (r_1, e'_1, s'_1)).

Similarly we let the Sigma protocol to prove knowledge of y be given by

    R_2(y, k_2), S_2(e_2, y, k_2), V_2(r_2, e_2, s_2), S'_2(e_2, s_2), E_2((r_2, e_2, s_2), (r_2, e'_2, s'_2)).

We assume in what follows that the challenges e_1 and e_2 are bit strings of the same length. What
is important is that they come from the same set \mathbb{E}, and can be combined in a one-time-pad-like
manner.[4]

Now suppose we know x, but not y; then our algorithms for the combined proof become:


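    R(x, k_1)                                 := (r_1, r_2) \gets \{ r_1 \gets R_1(x, k_1);  e_2 \gets \mathbb{E};  s_2 \gets \mathbb{S}_2;  r_2 \gets S'_2(e_2, s_2) \},

    S(e, x, k_1)                              := (e_1, e_2, s_1, s_2) \gets \{ e_1 \gets e \oplus e_2;  s_1 \gets S_1(e_1, x, k_1) \},

    V((r_1, r_2), e, (e_1, e_2, s_1, s_2))    := true if and only if e = e_1 \oplus e_2 and V_1(r_1, e_1, s_1) and V_2(r_2, e_2, s_2),

    S'(e, (e_1, e_2, s_1, s_2))               := (r_1, r_2) \gets (S'_1(e_1, s_1), S'_2(e_2, s_2)).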


Note that the prover does not reveal the value of e_1, e_2 or s_2 until the response stage of the
protocol. Also note that in the simulated protocol the correct distributions of e, e_1 and e_2 are such
that e = e_1 \oplus e_2. The protocol for the case where we know y but not x follows by reversing the roles
of e_1 and e_2, and r_1 and r_2 in the algorithms R, S and V. If the prover knows both x and y then
they can execute either of the two possibilities. The completeness, soundness and zero-knowledge
properties follow from the corresponding properties of the original Sigma protocols.

These  “Or”  proofs  can  be  extended  to  an  arbitrary  number  of  disjunctions  of  statements  in  the 
obvious  manner:  Given  n  statements  of  which  the  prover  only  knows  one  secret, 

•  Simulate  n  —  1  statements  using  the  simulations  and  challenges 

•  Commit  as  usual  to  the  known  statement 

•  Generate  a  correct  challenge  for  the  known  statement  via 


    e = e_1 \oplus \cdots \oplus e_n.


Example 1: We now present a simple example which uses the Schnorr protocol as a building
block. Suppose we wish to prove knowledge of either x_1 or x_2 such that y_1 = g^{x_1} and y_2 = g^{x_2},
where g lies in a group G of prime order q. We assume that the prover knows x_i but not x_j where
i \neq j.

The prover’s commitment, (r_1, r_2), is computed by selecting e_j and k_i uniformly at random
from \mathbb{F}_q^* and s_j uniformly at random from \mathbb{F}_q. They then compute r_i \gets g^{k_i} and
r_j \gets g^{s_j} \cdot y_j^{-e_j}.

[4] So for example we could use addition if they came from a finite field.

On receiving the challenge e \in \mathbb{F}_q^* the prover computes

    e_i \gets e - e_j \pmod{q},

    s_i \gets k_i + e_i \cdot x_i \pmod{q}.

Note that we have replaced \oplus in computing the “challenge” e_i with addition modulo q; a moment’s
thought reveals that this is a better way to preserve the relative distributions in this example since
arithmetic associated with the challenge is performed in \mathbb{F}_q. The prover then outputs (e_1, e_2, s_1, s_2).
The verifier checks the proof by checking that e = e_1 + e_2 \pmod{q} and

    r_1 = g^{s_1} \cdot y_1^{-e_1} and r_2 = g^{s_2} \cdot y_2^{-e_2}.
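
Example 1 can be sketched as follows, with indices 0 and 1 in place of 1 and 2. The toy group (p = 2039, q = 1019, g = 4) and the function names are assumptions for illustration, and for simplicity the random values are drawn from the whole of Z/qZ rather than from the non-zero elements only.

```python
import secrets

# Toy parameters (illustrative assumption, not secure): g = 4 has prime order q in (Z/pZ)*.
P, Q, G = 2039, 1019, 4

def or_commit(ys, i):
    """Commit when only the witness for ys[i] is known; branch j = 1 - i is simulated."""
    j = 1 - i
    k_i, e_j, s_j = (secrets.randbelow(Q) for _ in range(3))
    r = [None, None]
    r[i] = pow(G, k_i, P)                                    # real commitment
    r[j] = pow(G, s_j, P) * pow(ys[j], (-e_j) % Q, P) % P    # simulated commitment
    return tuple(r), (k_i, e_j, s_j)

def or_respond(e, x_i, i, state):
    """Split the challenge as e = e_i + e_j (mod q) and answer the real branch."""
    k_i, e_j, s_j = state
    j = 1 - i
    es, ss = [None, None], [None, None]
    es[j], ss[j] = e_j, s_j
    es[i] = (e - e_j) % Q
    ss[i] = (k_i + es[i] * x_i) % Q
    return tuple(es), tuple(ss)

def or_verify(ys, r, e, es, ss):
    """Check the challenge split and both Schnorr verification equations."""
    if (es[0] + es[1]) % Q != e % Q:
        return False
    return all(r[t] == pow(G, ss[t], P) * pow(ys[t], (-es[t]) % Q, P) % P
               for t in range(2))

x_known = secrets.randbelow(Q)
y_other = pow(G, secrets.randbelow(Q), P)   # exponent discarded: treated as unknown
ys = [y_other, pow(G, x_known, P)]          # the prover knows the witness for index 1 only
r, state = or_commit(ys, 1)
e = secrets.randbelow(Q)
es, ss = or_respond(e, x_known, 1, state)
accepted = or_verify(ys, r, e, es, ss)
```

The verifier sees two valid Schnorr transcripts whose challenges sum to e, and nothing in them distinguishes the real branch from the simulated one.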


Example 2: We end this section by giving a protocol which will be required when we discuss voting
schemes. It is obtained by combining the protocol for proving knowledge of Pedersen commitments
with “Or” proofs. Consider the earlier commitment scheme given by

    B_a(x) = h^x \cdot g^a,

where G = \langle g \rangle is a finite abelian group of prime order q, h is an element of G whose discrete
logarithm with respect to g is unknown, x is the value being committed to, and a is a random
value. We are interested in the case where the value committed to is restricted to be either plus
or minus one, i.e. x \in \{-1, 1\}. It will be important in our application for the person committing
to prove that their commitment is from the set \{-1, 1\} without revealing what the actual value of
the committed value is. To do this we execute the following protocol.

• As well as publishing the commitment B_a(x), Peggy also chooses random numbers d, r
and w modulo q and then publishes \alpha_1 and \alpha_2 where

    \alpha_1 \gets g^r \cdot (B_a(x) \cdot h)^{-d}         if x = 1,
    \alpha_1 \gets g^w                                      if x = -1,

    \alpha_2 \gets g^w                                      if x = 1,
    \alpha_2 \gets g^r \cdot (B_a(x) \cdot h^{-1})^{-d}     if x = -1.

• Victor now sends a random challenge e to Peggy.

• Peggy responds by setting

    d' \gets e - d,
    r' \gets w + a \cdot d'.

  Then Peggy returns the values

    (e_1, e_2, r_1, r_2) \gets (d, d', r, r')     if x = 1,
    (e_1, e_2, r_1, r_2) \gets (d', d, r', r)     if x = -1.

• Victor then verifies that the following three equations hold:

    e = e_1 + e_2,
    g^{r_1} = \alpha_1 \cdot (B_a(x) \cdot h)^{e_1},
    g^{r_2} = \alpha_2 \cdot (B_a(x) \cdot h^{-1})^{e_2}.

To  show  that  the  above  protocol  works  we  need  to  show  that 

(1)  If  Peggy  responds  honestly  then  Victor  will  verify  that  the  above  three  equations  hold. 

(2)  If  Peggy  has  not  committed  to  plus  or  minus  one  then  she  will  find  it  hard  to  produce  a 
response  to  Victor’s  challenge  which  is  correct. 

(3) The protocol reveals no information to any party as to the exact value of Peggy’s commitment,
bar that it comes from the set \{-1, 1\}.

We leave the verification of these three points to the reader. Note that the above protocol can
clearly be conducted in a non-interactive manner by defining e = H(\alpha_1 \| \alpha_2 \| B_a(x)).
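
The following sketch runs the non-interactive form of this protocol for both possible values of x. The toy group (p = 2039, q = 1019, g = 4, h = 9), the use of SHA-256 reduced modulo q for H, and the string encoding of the hash input are assumptions made here for illustration; in particular, in this toy setting the discrete logarithm of h is not actually hidden.

```python
import hashlib
import secrets

# Toy parameters (illustrative assumption, not secure): g and h both have prime order q.
P, Q, G, H = 2039, 1019, 4, 9
H_INV = pow(H, Q - 1, P)   # h^(-1), since h has order q

def commit_value(x, a):
    """Pedersen commitment B_a(x) = h^x * g^a for x in {-1, 1}."""
    return pow(H, x % Q, P) * pow(G, a, P) % P

def first_move(x, B):
    """Publish (alpha_1, alpha_2): one real commitment g^w and one simulated value."""
    d, r, w = (secrets.randbelow(Q) for _ in range(3))
    simulated_base = B * H % P if x == 1 else B * H_INV % P
    sim = pow(G, r, P) * pow(simulated_base, (-d) % Q, P) % P
    real = pow(G, w, P)
    alphas = (sim, real) if x == 1 else (real, sim)
    return alphas, (d, r, w)

def fs_challenge(alphas, B):
    """e = H(alpha_1 || alpha_2 || B_a(x)), with an assumed encoding."""
    data = f"{alphas[0]}|{alphas[1]}|{B}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def response(x, a, e, state):
    d, r, w = state
    d_prime = (e - d) % Q
    r_prime = (w + a * d_prime) % Q
    return (d, d_prime, r, r_prime) if x == 1 else (d_prime, d, r_prime, r)

def check(B, alphas, e, resp):
    """Victor's three verification equations."""
    e1, e2, r1, r2 = resp
    a1, a2 = alphas
    return ((e1 + e2) % Q == e % Q
            and pow(G, r1, P) == a1 * pow(B * H % P, e1, P) % P
            and pow(G, r2, P) == a2 * pow(B * H_INV % P, e2, P) % P)

results = []
for x in (1, -1):
    a = secrets.randbelow(Q)
    B = commit_value(x, a)
    alphas, state = first_move(x, B)
    e = fs_challenge(alphas, B)
    results.append(check(B, alphas, e, response(x, a, e, state)))
```

Both entries of `results` are `True`, which corresponds to point (1) in the list above: an honest Peggy always passes Victor's three checks.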


21.4.  An  Electronic  Voting  System 

In  this  section  we  describe  an  electronic  voting  system  which  utilizes  some  of  the  primitives  we  have 
been  discussing  in  this  chapter  and  in  earlier  chapters.  In  particular  we  make  use  of  secret  sharing 
schemes from Chapter 19, commitment schemes from Chapter 20, and zero-knowledge proofs from
this  chapter.  The  purpose  is  to  show  how  basic  cryptographic  primitives  can  be  combined  into  a 
complicated  application  giving  real  value.  One  can  consider  an  electronic  voting  scheme  to  be  a 
special  form  of  secure  multi-party  computation,  a  topic  which  we  shall  return  to  in  Chapter  22. 

Our  voting  system  will  assume  that  we  have  m  voters,  and  that  there  are  n  centres  which 
perform  the  tallying.  The  use  of  a  multitude  of  tallying  centres  is  to  allow  voter  anonymity  and 
stop  a  few  centres  colluding  to  fix  the  vote.  We  shall  assume  that  voters  are  only  given  a  choice  of 
two  candidates,  for  example  Democrat  or  Republican. 

The  voting  system  we  shall  describe  will  have  the  following  seven  properties. 

(1)  Only  authorized  voters  will  be  able  to  vote. 

(2)  No  one  will  be  able  to  vote  more  than  once. 

(3)  No  stakeholder  will  be  able  to  determine  how  someone  else  has  voted. 

(4)  No  one  can  duplicate  someone  else’s  vote. 

(5)  The  final  result  will  be  correctly  computed. 

(6)  All  stakeholders  will  be  able  to  verify  that  the  result  was  computed  correctly. 

(7)  The  protocol  will  work  even  in  the  presence  of  some  bad  parties. 

System Set-up: Each of the n tally centres has a public key encryption function E_i. We assume
a finite abelian group G is fixed, of prime order q, and two elements g, h \in G are selected for which
no party (including the tally centres) knows the discrete logarithm x in h = g^x. Each voter has a public
key signature algorithm, with which they sign all messages. This last point is to ensure only valid
voters vote, and we will ignore this issue in what follows as it is orthogonal to the points we want
to bring out.

Vote Casting: Each of the m voters picks a vote v_j from the set \{-1, 1\}. The voter picks a random
blinding value a_j \gets \mathbb{Z}/q\mathbb{Z} and publishes their vote B_j \gets B_{a_j}(v_j), using the Pedersen commitment
scheme. This vote is public to all participating parties, both tally centres and other voters. Along
with the vote B_j the voter also publishes a non-interactive version of the protocol from Section
21.3.8 to show that the vote was indeed chosen from the set \{-1, 1\}. The vote and its proof are
then digitally signed using the signing algorithm of the voter.

Vote Distribution: We now need to distribute the votes cast around the tally centres so that the
final tally can be computed. To share the a_j and v_j around the tallying centres each voter employs
Shamir secret sharing as follows: Each voter picks two random polynomials modulo q of degree
t < n,

    R_j(X) \gets v_j + r_{1,j} \cdot X + \cdots + r_{t,j} \cdot X^t,

    S_j(X) \gets a_j + s_{1,j} \cdot X + \cdots + s_{t,j} \cdot X^t.

The voter computes

    (u_{i,j}, w_{i,j}) = (R_j(i), S_j(i)) for 1 \le i \le n.

The voter encrypts the pair (u_{i,j}, w_{i,j}) using the i-th tally centre’s encryption algorithm E_i. This
encrypted share is sent to the relevant tally centre. The voter then publishes its commitment to
the polynomial R_j(X) by publicly posting B_{\ell,j} \gets B_{s_{\ell,j}}(r_{\ell,j}) for 1 \le \ell \le t, again using the earlier
commitment scheme.

Consistency Check: Each centre i needs to check that the values of (u_{i,j}, w_{i,j}) it has received
from voter j are consistent with the commitment made by the voter. This is done by verifying the
following equation:

    B_j \cdot \prod_{\ell=1}^{t} B_{\ell,j}^{i^\ell} = B_{a_j}(v_j) \cdot \prod_{\ell=1}^{t} B_{s_{\ell,j}}(r_{\ell,j})^{i^\ell}

        = h^{v_j} \cdot g^{a_j} \cdot \prod_{\ell=1}^{t} (h^{r_{\ell,j}} \cdot g^{s_{\ell,j}})^{i^\ell}

        = h^{(v_j + \sum_{\ell=1}^{t} r_{\ell,j} \cdot i^\ell)} \cdot g^{(a_j + \sum_{\ell=1}^{t} s_{\ell,j} \cdot i^\ell)}

        = h^{u_{i,j}} \cdot g^{w_{i,j}}.
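
The sharing step and the consistency check can be sketched for a single voter as follows. The toy group (p = 2039, q = 1019, g = 4, h = 9), the choice of n = 5 centres and t = 2, and all function names are assumptions for illustration; encryption of the shares and the voter's signature are omitted.

```python
import secrets

# Toy parameters (illustrative assumption, not secure): g and h both have prime order q.
P, Q, G, H = 2039, 1019, 4, 9
N_CENTRES, T = 5, 2

def pedersen(x, a):
    """B_a(x) = h^x * g^a."""
    return pow(H, x % Q, P) * pow(G, a % Q, P) % P

def evaluate(coeffs, i):
    """Evaluate the polynomial with the given coefficients at the point i, modulo q."""
    return sum(c * pow(i, l, Q) for l, c in enumerate(coeffs)) % Q

def share_vote(v):
    """Return the shares (u_i, w_i) for each centre and the public commitments."""
    a = secrets.randbelow(Q)
    r_coeffs = [v % Q] + [secrets.randbelow(Q) for _ in range(T)]   # R_j(X)
    s_coeffs = [a] + [secrets.randbelow(Q) for _ in range(T)]       # S_j(X)
    shares = {i: (evaluate(r_coeffs, i), evaluate(s_coeffs, i))
              for i in range(1, N_CENTRES + 1)}
    # commitments[0] is the vote B_j; commitments[l] is B_{l,j} for l = 1..t.
    commitments = [pedersen(r, s) for r, s in zip(r_coeffs, s_coeffs)]
    return shares, commitments

def consistent(i, share, commitments):
    """Centre i checks B_j * prod_l B_{l,j}^(i^l) = h^u * g^w."""
    u, w = share
    lhs = 1
    for l, B in enumerate(commitments):
        lhs = lhs * pow(B, pow(i, l, Q), P) % P
    return lhs == pedersen(u, w)

shares, commitments = share_vote(-1)
all_consistent = all(consistent(i, shares[i], commitments) for i in shares)
```

Every honest share passes the check, while altering the first component of a share by one is detected, since it changes the right-hand side by a factor of h.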


Tally Counting: Tally centre i now computes and publicly posts its sum of the shares of the votes
cast T_i = \sum_{j=1}^{m} u_{i,j}, plus it posts its sum of shares of the blinding factors A_i = \sum_{j=1}^{m} w_{i,j}. Every
other party, both other centres and voters, can check that this has been done correctly by verifying
that

    \prod_{j=1}^{m} ( B_j \cdot \prod_{\ell=1}^{t} B_{\ell,j}^{i^\ell} ) = \prod_{j=1}^{m} h^{u_{i,j}} \cdot g^{w_{i,j}} = h^{T_i} \cdot g^{A_i}.

Any party can compute the final tally by taking t + 1 of the values T_i and interpolating them to reveal
the final tally (a polynomial of degree t is determined by t + 1 of its values). This is because T_i is the
evaluation at i of a polynomial which shares out the sum of the votes. To see this we have

    T_i = \sum_{j=1}^{m} u_{i,j} = \sum_{j=1}^{m} R_j(i)
        = ( \sum_{j=1}^{m} v_j ) + ( \sum_{j=1}^{m} r_{1,j} ) \cdot i + \cdots + ( \sum_{j=1}^{m} r_{t,j} ) \cdot i^t.

If the final tally is negative then the majority of people voted -1, whilst if the final tally is positive
then the majority of people voted +1. You should now convince yourself that the above protocol
has the seven properties we said it would at the beginning.
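
The tallying step can be illustrated with plain Shamir sharing, ignoring the commitments and blinding values. The modulus q = 1019, the choice of n = 5 centres with t = 2, and the seven example votes are assumptions made here; the sketch also assumes the number of voters is below q/2 so that the signed tally can be read off unambiguously.

```python
import secrets

Q = 1019                 # toy prime modulus (illustrative assumption)
N_CENTRES, T = 5, 2

def share(secret):
    """Shamir-share a value with a random polynomial of degree T; return {i: share_i}."""
    coeffs = [secret % Q] + [secrets.randbelow(Q) for _ in range(T)]
    return {i: sum(c * pow(i, l, Q) for l, c in enumerate(coeffs)) % Q
            for i in range(1, N_CENTRES + 1)}

def interpolate_at_zero(points):
    """Lagrange interpolation at X = 0 over Z/qZ from a dict {i: value_i}."""
    total = 0
    for i, y_i in points.items():
        num, den = 1, 1
        for j in points:
            if j != i:
                num = num * (-j) % Q
                den = den * (i - j) % Q
        total = (total + y_i * num * pow(den, Q - 2, Q)) % Q
    return total

votes = [1, -1, 1, 1, -1, 1, 1]
vote_shares = [share(v) for v in votes]

# Each centre i adds up the shares it received: T_i = sum_j u_{i,j}.
tallies = {i: sum(s[i] for s in vote_shares) % Q for i in range(1, N_CENTRES + 1)}

# Any T + 1 = 3 of the posted values determine the degree-T sum polynomial.
subset = {i: tallies[i] for i in (1, 3, 5)}
raw = interpolate_at_zero(subset)
final_tally = raw if raw <= Q // 2 else raw - Q   # map back to a signed integer
```

With these votes `final_tally` is 3, so the +1 candidate wins, and no individual share reveals how any single voter voted.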


Chapter  Summary 


• An interactive proof of knowledge leaks no information if the transcript could be simulated
without the need for the secret information.

• Both interactive proofs and zero-knowledge proofs are very powerful constructs; they can
be used to prove any statement in PSPACE.

• Interactive proofs of knowledge can be turned into digital signature algorithms by replacing
the challenge by the hash of the commitment concatenated with the message.

• Quite complicated protocols can then be built on top of our basic primitives of encryption,
signatures, commitment and zero-knowledge proofs. As an example we gave an electronic
voting protocol.


Further Reading

The  book  by  Goldreich  has  more  details  on  zero-knowledge  proofs,  whilst  a  good  overview  of 
this  area  is  given  in  the  first  edition  of  Stinson’s  book.  The  voting  scheme  we  describe  is  given  in 
the  paper  of  Cramer  et  al.  from  Eurocrypt. 

R. Cramer, M. Franklin, B. Schoenmakers and M. Yung. Multi-authority secret-ballot elections with
linear work. In Advances in Cryptology - Eurocrypt 1996, LNCS 1070, 72-83, Springer, 1996.

O.  Goldreich.  Modern  Cryptography,  Probabilistic  Proofs  and  Pseudo-randomness.  Springer,  1999. 

D.  Stinson.  Cryptography:  Theory  and  Practice.  First  Edition.  CRC  Press,  1995. 


CHAPTER  22 


Secure  Multi-party  Computation 


Chapter  Goals 

•  To  introduce  the  concept  of  multi-party  computation. 

•  To  present  a  two-party  protocol  based  on  Yao’s  garbled-circuit  construction. 

•  To  present  a  multi-party  protocol  based  on  Shamir  secret  sharing. 

22.1.  Introduction 

Secure  multi-party  computation  is  an  area  of  cryptography  which  deals  with  two  or  more  parties 
computing  a  function  on  their  private  inputs.  They  wish  to  do  so  in  a  way  that  means  that  their 
private  inputs  still  remain  private.  Of  course  depending  on  the  function  being  computed,  some 
information  about  the  inputs  may  leak.  The  classical  example  is  the  so-called  millionaires  problem; 
suppose  a  bunch  of  millionaires  have  a  lunch  time  meeting  at  an  expensive  restaurant  and  decide 
that  the  richest  of  them  will  pay  the  bill.  However,  they  do  not  want  to  reveal  their  actual  wealth 
to  each  other.  This  is  an  example  of  a  secure  multi-party  computation.  The  inputs  are  the  values 
x_i, which denote the wealth of each party, and the function to be computed is

    f(x_1, \ldots, x_n) = i where x_i > x_j for all i \neq j.

Clearly, if we compute such a function, then some information about party i’s value leaks; i.e. that
it is greater than all the other values. However, we require in secure multi-party computation that
this is the only information which leaks, even to the parties participating in the protocol.

One  can  consider  a  number  of  our  previous  protocols  as  being  examples  of  secure  multi-party 
computation.  For  example,  the  voting  protocol  given  previously  involves  the  computation  of  the 
result  of  each  party  voting,  without  anyone  learning  the  vote  being  cast  by  a  particular  party. 
Encryption  is  a  multi-party  computation  between  three  parties:  the  sender,  the  receiver  and  the 
adversary.  Only  the  sender  has  an  input  (which  is  the  message  to  be  encrypted)  and  only  the 
receiver  has  an  output  (which  is  the  message  when  decrypted).  In  fact  we  can  essentially  see  all  of 
cryptography  as  some  form  of  multi-party  computation. 

One solution to securely evaluating a function is for all the parties to send their inputs to a
trusted third party. This trusted party then computes the function and passes the output back to
the parties. However, we want to remove such a trusted third party entirely. Intuitively a
multi-party computation is said to be secure if the information which is leaked is precisely that which
would have leaked if the computation had been conducted by encrypting messages to a trusted
third party.

This is not the only security issue that needs to be addressed when considering secure
multi-party computation. There are two basic security models:

• In the first model the parties are guaranteed to follow the protocols, but are interested
in breaking the privacy of their fellow participants. Such adversaries are called
honest-but-curious, and they in some sense correspond to passive adversaries in other areas of
cryptography. Whilst honest-but-curious adversaries follow the protocol, a number of them
could combine their different internal data so as to subvert the security of the non-corrupt
parties.

•  In  the  second  model  the  adversaries  can  deviate  from  the  protocol  and  may  wish  to  pass 
incorrect  data  around  so  as  to  subvert  the  computation  of  the  function.  Again  we  allow 
such  adversaries  to  talk  to  each  other  in  a  coalition.  Such  an  adversary  is  called  a  malicious 
adversary.  In  such  situations  we  would  like  the  protocol  to  still  complete,  and  compute 
the  correct  function,  i.e.  it  should  be  both  correct  and  robust.  In  this  book  we  will  not 
discuss  modern  protocols  which  trade  robustness  for  other  benefits. 

There  is  a  problem  though.  If  we  assume  that  communication  is  asynchronous,  which  is  the  most 
practically  relevant  situation,  then  some  party  must  go  last.  In  such  a  situation  one  party  may 
have  learnt  the  outcome  of  the  computation,  but  one  party  may  not  have  the  value  yet  (namely 
the  party  which  receives  the  last  message).  Any  malicious  party  can  clearly  subvert  the  protocol 
by  not  sending  the  last  message.  Usually  malicious  adversaries  are  assumed  not  to  perform  such 
an attack. A protocol which is secure against an adversary which can delete the final
message is said to be fair.

In  what  follows  we  shall  mainly  explain  the  basic  ideas  behind  secure  multi-party  computation 
in  the  case  of  honest-but-curious  adversaries.  We  shall  touch  on  the  case  of  malicious  adversaries 
for  one  of  our  examples  though,  as  it  provides  a  nice  example  of  an  application  of  various  properties 
of  Shamir  secret  sharing. 

If  we  let  n  denote  the  number  of  parties  which  engage  in  the  protocol,  we  would  like  to  create 
protocols  for  secure  multi-party  computation  which  are  able  to  tolerate  a  large  number  of  corrupt 
parties.  It  turns  out  that  there  is  a  theoretical  limit  on  the  number  of  parties  whose  corruption  can 
be  tolerated. 

• For the case of honest-but-curious adversaries we can tolerate fewer than n/2 corrupt
parties, for computationally unbounded adversaries.

• If we restrict ourselves to computationally bounded adversaries then we can tolerate up
to n - 1 corrupt parties in the case of honest-but-curious adversaries.

• However, for a malicious adversary we can only tolerate fewer than n/3 corrupt parties if we
assume computationally unbounded adversaries.

• If we assume computationally bounded adversaries we can only tolerate fewer than n/2
malicious parties, unless we are prepared to accept an unfair/unrobust protocol, in which case
we can tolerate up to n - 1 corrupt parties.
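To make these bounds concrete, the following minimal sketch tabulates them as a function. The function name and its flags are our own, and we read "fewer than n/2" and "fewer than n/3" as strict inequalities; it is an illustration of the list above, not part of any protocol.

```python
def max_corrupt(n, malicious, bounded, allow_unfair=False):
    """Largest number of corrupt parties tolerated among n parties (hypothetical helper)."""
    if not malicious:
        # Honest-but-curious: up to n - 1 if bounded, otherwise strictly fewer than n/2.
        return n - 1 if bounded else (n - 1) // 2
    if not bounded:
        # Malicious and computationally unbounded: strictly fewer than n/3.
        return (n - 1) // 3
    # Malicious and bounded: fewer than n/2, or n - 1 if fairness/robustness is given up.
    return n - 1 if allow_unfair else (n - 1) // 2


for n in (3, 6, 10):
    print(n, max_corrupt(n, False, False), max_corrupt(n, True, False))
```

For six parties, as in the running example later in this chapter, the information-theoretic bounds come out as two honest-but-curious or one malicious party.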

Protocols  for  secure  multi-party  computation  usually  fall  into  one  of  two  distinct  families.  The 
first  is  based  on  an  idea  of  Yao  called  a  garbled  circuit  or  Yao  circuit:  in  this  case  one  presents  the 
function  to  be  computed  as  a  binary  circuit,  and  then  one  “encrypts”  the  gates  of  this  circuit  to 
form  the  garbled  circuit.  This  approach  is  clearly  based  on  a  computational  assumption,  i.e.  that 
the  encryption  scheme  is  secure.  The  second  approach  is  based  on  secret  sharing  schemes:  here  one 
usually  represents  the  function  to  be  computed  as  an  arithmetic  circuit.  In  this  second  approach 
one  uses  a  perfectly  secure  secret  sharing  scheme  to  obtain  perfect  security. 

It  turns  out  that  the  first  approach  seems  better  suited  to  the  case  where  there  are  two  parties, 
whilst  the  second  approach  seems  better  suited  to  the  case  of  three  or  more  parties.  In  our  discussion 
below  we  will  present  a  computationally  secure  solution  for  the  two-party  case  in  the  presence  of 
honest-but-curious  adversaries,  based  on  Yao  circuits.  This  approach  can  be  extended  to  more 
than  two  parties  and  a  malicious  adversary,  but  doing  this  is  beyond  the  scope  of  this  book.  We 
then  present  a  protocol  for  the  multi-party  case  which  is  perfectly  secure.  We  sketch  two  versions, 
one  which  provides  security  against  honest-but-curious  adversaries  and  one  which  provides  security 
against malicious adversaries.

22.2. The Two-Party Case

We shall in this section consider the method of secure multi-party computation based on garbled
circuits. We suppose there are two parties A and B with inputs x and y respectively, and that A
wishes to compute $f_A(x, y)$ and B wishes to compute $f_B(x, y)$. Recall this needs to be done without
B learning anything about x or $f_A(x, y)$, except what he can deduce from $f_B(x, y)$ and y, with a
similar privacy statement applying to A.

First note that it is enough for B to receive the output of a related function f. To see this we
let A have an extra secret input k which is as long as the maximum output of her function $f_A(x, y)$.
If we can create a protocol in which B learns the value of the function

$$f(x, y, k) = (k \oplus f_A(x, y), \; f_B(x, y)),$$

then B simply sends the value of $k \oplus f_A(x, y)$ back to A who can then decrypt it using k, and
so determine $f_A(x, y)$. Hence, we will assume that there is only one function which needs to be
computed and that its output will be determined by B.
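The masking trick can be sketched in a few lines. Here `f_A` and `f_B` are hypothetical 8-bit example functions of our own choosing, and the call to `f` merely stands in for the secure computation whose whole output is learnt by B.

```python
import secrets


def f_A(x, y):
    """Hypothetical function A wants to learn (8-bit output)."""
    return (x + y) & 0xFF


def f_B(x, y):
    """Hypothetical function B wants to learn (8-bit output)."""
    return (x * y) & 0xFF


def f(x, y, k):
    """The single combined function; B learns both components."""
    return (k ^ f_A(x, y), f_B(x, y))


def run(x, y):
    k = secrets.randbits(8)        # A's extra secret input, as long as the output of f_A
    masked, out_B = f(x, y, k)     # placeholder for the secure two-party computation
    out_A = masked ^ k             # B sends `masked` back; A removes the mask with k
    return out_A, out_B


print(run(20, 40))
```

Since k is uniformly random and known only to A, the first component tells B nothing about $f_A(x, y)$.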

So suppose f(x,y) is the function which is to be computed; we will assume that f(x,y) can
be computed in polynomial time. Therefore there is a polynomial-sized binary circuit which
computes the output of the function. In the forthcoming example we will write out such a
circuit, and so in Figure 22.1 we recall the standard symbols for a binary circuit.


Figure 22.1. The basic logic gates (including AND and NAND)


A binary circuit can be represented by a collection of wires $W = \{w_1, \ldots, w_n\}$ and a collection
of gates $G = \{g_1, \ldots, g_m\}$. Each gate is a function which takes as input the values of two wires,
and produces the value of the output wire. For example suppose $g_1$ is an AND gate which takes as
input wires $w_1$ and $w_2$ and produces the output wire $w_3$. Then gate $g_1$ can be represented by the
following truth table.

    w_1   w_2   w_3
     0     0     0
     0     1     0
     1     0     0
     1     1     1

In other words, the gate $g_i$ represents a function, $w_3 \leftarrow g_i(w_1, w_2)$, such that

$$0 = g_i(0, 0) = g_i(1, 0) = g_i(0, 1) \quad \text{and} \quad 1 = g_i(1, 1).$$

22.2.1.  Garbled  Circuit  Construction:  In  Yao’s  protocol  one  party  constructs  a  garbled  circuit 
(we  shall  call  this  party  A),  the  other  party  evaluates  the  garbled  circuit  (we  shall  call  this  party 
B).  The  garbled  circuit  is  constructed  as  follows: 

• For each wire $w_i$ two random cryptographic keys are selected, $k_i^0$ and $k_i^1$. The first one
represents the encryption of the zero value and the second represents the encryption of
the one value.

• For each wire a random value $p_i \in \{0, 1\}$ is chosen. This is used to also encrypt the actual
wire value. If the actual wire value is $v_i$ then the encrypted, or “external” value, is given
by $e_i = v_i \oplus p_i$.

• For each gate we compute a “garbled table” representing the function of the gate on these
encrypted values. Suppose $g_i$ is a gate with input wires $w_{i_0}$ and $w_{i_1}$ and output wire $w_{i_2}$,
then the garbled table is the following four values, for some encryption function E:

$$c^i_{a,b} = E_{k_{i_0}^{a \oplus p_{i_0}},\; k_{i_1}^{b \oplus p_{i_1}}} \left( k_{i_2}^{o_{a,b}} \,\|\, (o_{a,b} \oplus p_{i_2}) \right) \quad \text{for } a, b \in \{0, 1\},$$

where $o_{a,b} = g_i(a \oplus p_{i_0}, \; b \oplus p_{i_1})$.

We  do  not  consider  exactly  what  encryption  function  is  chosen;  such  a  discussion  is  slightly  beyond 
the  scope  of  this  book.  If  you  want  further  details  then  look  in  the  Further  Reading  section  at  the 
end  of  this  chapter,  or  just  assume  we  take  an  encryption  scheme  which  is  suitably  secure. 
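As a rough illustration of the three steps, the sketch below garbles a single gate. The encryption function E is deliberately left open in the text; here we assume, purely for illustration, a toy two-key scheme which XORs the message with a SHA-256 hash of the two keys. All function names are our own, and this toy E is not claimed to be a suitably secure choice.

```python
import hashlib
import secrets

KEYLEN = 16  # key length in bytes (an arbitrary choice for this sketch)


def E(k1, k2, msg):
    """Toy two-key encryption: XOR with a hash-derived pad. It is its own inverse."""
    pad = hashlib.sha256(k1 + k2).digest()
    return bytes(m ^ s for m, s in zip(msg, pad))


def new_wire():
    """Two keys (for the values 0 and 1) and the random mask bit p of one wire."""
    return {"k": (secrets.token_bytes(KEYLEN), secrets.token_bytes(KEYLEN)),
            "p": secrets.randbits(1)}


def garbled_value(wire, v):
    """The pair (key, external value) representing the actual value v on the wire."""
    return wire["k"][v], v ^ wire["p"]


def garble_gate(g, w_in0, w_in1, w_out):
    """Build the four table entries c_{a,b}, indexed by the external values a and b."""
    table = {}
    for a in (0, 1):
        for b in (0, 1):
            v0, v1 = a ^ w_in0["p"], b ^ w_in1["p"]       # actual values behind a and b
            o = g(v0, v1)                                  # o_{a,b}
            msg = w_out["k"][o] + bytes([o ^ w_out["p"]])  # output key || external value
            table[a, b] = E(w_in0["k"][v0], w_in1["k"][v1], msg)
    return table


def eval_gate(table, in0, in1):
    """Decrypt the single entry selected by the two external values."""
    (k0, e0), (k1, e1) = in0, in1
    msg = E(k0, k1, table[e0, e1])
    return msg[:KEYLEN], msg[KEYLEN]
```

An evaluator holding `garbled_value(w1, 1)` and `garbled_value(w2, 1)` for an AND gate obtains the key for the value 1 on the output wire, together with its external value, and nothing else.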

The above may seem rather confusing so we illustrate the method for constructing the garbled
circuit with an example. Suppose A and B each have as input two bits; we shall denote A’s input
wires by $w_1$ and $w_2$, whilst we shall denote B’s input wires by $w_3$ and $w_4$. Suppose they now wish
to engage in a secure multi-party computation so that B learns the value of the function

$$f(\{w_1, w_2\}, \{w_3, w_4\}) = (w_1 \wedge w_3) \vee (w_2 \oplus w_4).$$

A circuit to represent this function is given in Figure 22.2.

In Figure 22.2 we also present the garbled values of each wire and the corresponding garbled
tables representing each gate. In this example we have the following values of $p_i$:

$$p_1 = p_4 = p_6 = p_7 = 1 \quad \text{and} \quad p_2 = p_3 = p_5 = 0.$$

Consider the first wire; the two garbled values of the wire are $k_1^0$ and $k_1^1$, which represent the 0 and
1 values, respectively. Since $p_1 = 1$, the external value of the internal 0 value is 1 and the external
value of the internal 1 value is 0. Thus we represent the garbled value of the wire by the pair of
pairs

$$(k_1^0 \,\|\, 1, \; k_1^1 \,\|\, 0).$$

Now we look at the gates, and in particular consider the first gate. The first gate is an AND gate
which takes as input the first and third wires. The first entry in this table corresponds to a = b = 0.
Now the p values for the first and third wires are 1 and 0 respectively. Hence, the first entry in the
table corresponds to what should happen if the keys $k_1^1$ and $k_3^0$ are seen, since

$$1 = 1 \oplus 0 = p_1 \oplus a \quad \text{and} \quad 0 = 0 \oplus 0 = p_3 \oplus b.$$

Figure 22.2. A garbled circuit

Now the AND gate should produce the 0 output on input of 1 and 0, thus the thing which is
encrypted in the first line is the key representing the zero value of the fifth wire, i.e. $k_5^0$, plus the
“external value” of 0, namely $0 = 0 \oplus 0 = 0 \oplus p_5$.

22.2.2. Garbled Circuit Evaluation: We now describe how the circuit is evaluated by party
B. Please refer to Figure 22.3 for a graphical description of this. We assume that B has obtained
in some way the specific garbled values of the input wires marked in blue in Figure 22.3, and the
value of $p_i$ for the output wires; in our example this is just $p_7$. Firstly party B evaluates the AND
gate; he knows that the external value of wire one is 1 and the external value of wire three is 1.
Thus he looks up the entry $c^1_{1,1}$ in the table and decrypts it using the two keys he knows, i.e. $k_1^0$
and $k_3^1$. He then obtains the value $k_5^0 \,\|\, 0$. He has no idea whether this represents the zero or one
value of the fifth wire, since he has no idea as to the value of $p_5$.

Party B then performs the same operation with the exclusive-or gate. This has input wire 2
and wire 4, for which party B knows that the external values are 0 and 1 respectively. Thus party
B decrypts the entry $c^2_{0,1}$ to obtain $k_6^0 \,\|\, 1$. A similar procedure is then carried out with the final OR
gate, using the keys and external values of the fifth and sixth wires. This results in a decryption
which reveals the value $k_7^0 \,\|\, 1$. So the external value of the seventh wire is equal to 1, but party B
has been told that $p_7 = 1$, and hence the internal value of wire seven will be $0 = 1 \oplus 1$. Hence, the
output of the function is the bit 0.
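The whole example can be replayed in code. The sketch below garbles the three-gate circuit with the mask bits of the example and then evaluates it on the inputs A = (0, 0) and B = (1, 0), which give exactly the external values used in the walk-through above. The encryption function is again an assumption of ours (a hash-derived XOR pad), as are all names; the text does not fix E.

```python
import hashlib
import secrets

KEYLEN = 16
P = {1: 1, 2: 0, 3: 0, 4: 1, 5: 0, 6: 1, 7: 1}      # mask bits p_i of the example
GATES = [                                            # (gate function, in0, in1, out)
    (lambda x, y: x & y, 1, 3, 5),                   # AND on wires 1 and 3
    (lambda x, y: x ^ y, 2, 4, 6),                   # XOR on wires 2 and 4
    (lambda x, y: x | y, 5, 6, 7),                   # OR on wires 5 and 6
]


def E(k1, k2, msg):
    """Toy two-key encryption (assumed for illustration); it is its own inverse."""
    pad = hashlib.sha256(k1 + k2).digest()
    return bytes(m ^ s for m, s in zip(msg, pad))


def garble():
    """Pick two keys per wire and build one garbled table per gate."""
    keys = {w: (secrets.token_bytes(KEYLEN), secrets.token_bytes(KEYLEN)) for w in P}
    tables = []
    for g, i0, i1, i2 in GATES:
        t = {}
        for a in (0, 1):
            for b in (0, 1):
                v0, v1 = a ^ P[i0], b ^ P[i1]
                o = g(v0, v1)
                t[a, b] = E(keys[i0][v0], keys[i1][v1], keys[i2][o] + bytes([o ^ P[i2]]))
        tables.append(t)
    return keys, tables


def evaluate(tables, inputs):
    """inputs maps each input wire to (key, external value); returns the same for all wires."""
    wires = dict(inputs)
    for (g, i0, i1, i2), t in zip(GATES, tables):
        (k0, e0), (k1, e1) = wires[i0], wires[i1]
        msg = E(k0, k1, t[e0, e1])
        wires[i2] = (msg[:KEYLEN], msg[KEYLEN])
    return wires


keys, tables = garble()
values = {1: 0, 2: 0, 3: 1, 4: 0}                    # w1 = w2 = 0 (party A), w3 = 1, w4 = 0 (party B)
inputs = {w: (keys[w][v], v ^ P[w]) for w, v in values.items()}
wires = evaluate(tables, inputs)
external = {w: e for w, (k, e) in wires.items()}
output = external[7] ^ P[7]                          # B unmasks the output wire with p_7
print(external, output)
```

The evaluator only ever sees one key per wire and the external values; the output bit is recovered at the very end using $p_7$.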

22.2.3.  Yao’s  Protocol:  We  are  now  in  a  position  to  describe  Yao’s  protocol  in  detail.  The 
protocol  proceeds  in  five  phases  as  follows: 

(1) Party A generates the garbled circuit as above, and transmits to party B only the values
$c^i_{a,b}$, i.e. the garbled tables.

(2) Party A then transmits to party B the garbled values corresponding to her own input wires.
For example, suppose that party A’s input is $w_1 = 0$ and $w_2 = 0$. Then party A transmits
to party B the two values $k_1^0 \,\|\, 1$ and $k_2^0 \,\|\, 0$. Note that party B cannot learn the actual values
of $w_1$ and $w_2$ from these values since he does not know $p_1$ and $p_2$, and the keys $k_1^0$ and $k_2^0$
just look like random keys.

(3) Party A and B then engage in an oblivious transfer protocol, as in Section 20.3, for each
of party B’s input wires. In our example suppose that party B’s input is $w_3 = 1$ and
$w_4 = 0$. The two parties execute two oblivious transfer protocols, one with A’s input $k_3^0 \,\|\, 0$
and $k_3^1 \,\|\, 1$, and B’s input 1, and one with A’s input $k_4^0 \,\|\, 1$ and $k_4^1 \,\|\, 0$, and B’s input 0. At
the end of this oblivious transfer phase party B has learnt $k_3^1 \,\|\, 1$ and $k_4^0 \,\|\, 1$.

(4) Party A then transmits to party B the values of $p_i$ for all of the output wires. In our
example she reveals the value of $p_7 = 1$.

(5) Finally party B evaluates the circuit using the garbled input wire values he has been given,
using the technique described above.

Figure 22.3. Evaluating a garbled circuit


In summary, in the first stage all party B knows about the garbled circuit is the blue items in
Figure 22.2, but by the last stage he knows the blue items in Figure 22.3.

In our example we can now assess what party B has learnt from the computation. Party B
knows that the output of the final OR gate is zero, which means that the inputs must also be zero,
which means that the output of the AND gate is zero and the output of the exclusive-or gate is
zero. Since party B’s own input on the fourth wire was zero, he learns that party A’s second input
wire represented zero, since otherwise the exclusive-or gate would not have output zero. Moreover,
since party B’s input on the third wire was one, the zero output of the AND gate tells him that party
A’s first input wire represented zero as well. So for these particular inputs neither of party A’s
inputs remains private. Had party B’s input on the third wire been zero, he would already have
known that the output of the AND gate would be zero, and party A’s first input would have remained
private. This is what we meant by a protocol keeping the inputs private, bar what could be deduced
from the output of the function.
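What B can deduce from the output alone can be checked by brute force. The sketch below (the helper name is ours) lists every input of party A which is consistent with B’s own input and the output of the example function.

```python
from itertools import product


def f(w1, w2, w3, w4):
    """The example function (w1 AND w3) OR (w2 XOR w4)."""
    return (w1 & w3) | (w2 ^ w4)


def consistent_inputs(w3, w4, output):
    """All inputs (w1, w2) of A that B cannot rule out from his own input and the output."""
    return [(w1, w2) for w1, w2 in product((0, 1), repeat=2)
            if f(w1, w2, w3, w4) == output]


print(consistent_inputs(1, 0, 0))   # B's input from the example run
print(consistent_inputs(0, 0, 0))   # the same output, but with w3 = 0
```

With B’s input (1, 0) and output 0 only A’s input (0, 0) survives, whereas with B’s input (0, 0) the first bit of A stays undetermined. No protocol can hide more than this, since the leakage comes from the function itself.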

22.3.  The  Multi-party  Case:  Honest-but-Curious  Adversaries 

The multi-party case is based on using a secret sharing scheme to evaluate an arithmetic circuit.
An arithmetic circuit consists of a finite field $\mathbb{F}_q$ and a polynomial function (which could have many
inputs and outputs) defined over the finite field. The idea is that such a function can be evaluated
by executing a number of addition and multiplication gates over the finite field.

Given an arithmetic circuit it is clear one could express it as a binary circuit, by simply
expanding out the addition and multiplication gates of the arithmetic circuit as their binary circuit
equivalents. One can also represent every binary circuit as an arithmetic circuit, since every gate
in the binary circuit can be represented as a linear function of the input values to the gate and
their products. For example, suppose we represent the binary values 0 and 1 by 0 and 1 in the
finite field $\mathbb{F}_q$, and that the characteristic of $\mathbb{F}_q$ is larger than two. We then have that the binary
exclusive-or gate can be written as $x \oplus y = -2 \cdot x \cdot y + x + y$ over $\mathbb{F}_q$, and the binary “and” gate
can be written as $x \wedge y = x \cdot y$.
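These two identities are easily checked exhaustively. The sketch below does so over the prime field with q = 101, a modulus we pick only for illustration; any odd prime would do.

```python
Q = 101  # an odd prime, chosen here only for illustration


def xor_gate(x, y):
    """Binary exclusive-or written as an arithmetic expression over F_Q."""
    return (x + y - 2 * x * y) % Q


def and_gate(x, y):
    """Binary and written as an arithmetic expression over F_Q."""
    return (x * y) % Q


for x in (0, 1):
    for y in (0, 1):
        assert xor_gate(x, y) == x ^ y
        assert and_gate(x, y) == x & y
print("identities hold on all four input pairs")
```

Note that in characteristic two the term $-2 \cdot x \cdot y$ would vanish, which is why the characteristic is assumed to be larger than two.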

Whilst  the  two  representations  are  equivalent  it  is  clear  that  some  functions  are  easier  to 
represent  as  binary  circuits  and  some  are  easier  to  represent  as  arithmetic  circuits. 

As before we shall present the protocol via a running example. We shall suppose we have six
parties $P_1, \ldots, P_6$ who have six secret values $x_1, \ldots, x_6$, each of which lies in $\mathbb{F}_p$, for some reasonably
large prime p. For example we could take $p \approx 2^{128}$, but in our example to make things easier
to represent we will take p = 101. The parties are assumed to want to compute the value of the
function

$$f(x_1, \ldots, x_6) = x_1 \cdot x_2 + x_3 \cdot x_4 + x_5 \cdot x_6 \pmod{p}.$$

Hence,  the  arithmetic  circuit  for  this  function  consists  of  three  multiplication  gates  and  two  addition 
gates,  as  in  Figure  22.4,  where  we  label  the  intermediate  values  as  numbered  “wires”. 


Figure  22.4.  Graphical  representation  of  the  example  arithmetic  circuit 

We will use Shamir’s secret sharing, in which case our basic protocol is as follows: The value
of each wire $x_i$ is shared between all players, with each player j obtaining a share $x_i^{(j)}$. Clearly, if
enough players come together, then they can determine the value of the wire $x_i$ by the properties
of the secret sharing scheme.

Each  player  can  create  shares  of  his  or  her  own  input  values  at  the  start  of  the  protocol  and 
send  a  share  to  each  player.  Thus  we  need  to  show  how  to  obtain  the  shares  of  the  outputs  of  a 
gate, given shares of the inputs of the gate. Recall that in Shamir’s secret sharing the shared value
is given by the constant term of a polynomial f of degree t, with the shares being the evaluation
of the polynomial at given positions corresponding to each participant, $f(i)$.
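As a reminder of how such a sharing behaves, here is a minimal sketch of Shamir sharing and reconstruction over $\mathbb{F}_{101}$. The function names are our own, and the three-argument `pow` with exponent -1 needs Python 3.8 or later.

```python
import secrets

P = 101  # the small prime used in the running example


def share(secret, t, n):
    """Evaluate a random degree-t polynomial with constant term `secret` at 1, ..., n."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t)]
    return {i: sum(c * pow(i, e, P) for e, c in enumerate(coeffs)) % P
            for i in range(1, n + 1)}


def reconstruct(shares):
    """Lagrange interpolation at 0 from a dict {i: f(i)} holding at least t + 1 shares."""
    secret = 0
    for i, y in shares.items():
        num = den = 1
        for j in shares:
            if j != i:
                num = num * j % P
                den = den * (j - i) % P
        secret = (secret + y * num * pow(den, -1, P)) % P
    return secret


s = share(20, t=2, n=6)
print(reconstruct({i: s[i] for i in (2, 4, 6)}))   # any t + 1 = 3 shares suffice
```

Any three of the six shares recover the secret, whilst any two of them are consistent with every possible secret.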

22.3.1. Addition Gates: First we consider how to compute the Add gates. Suppose we have two
secrets a and b which are shared using the polynomials

$$f(X) = a + f_1 \cdot X + \cdots + f_t \cdot X^t,$$
$$g(X) = b + g_1 \cdot X + \cdots + g_t \cdot X^t.$$

Each of our parties has a share $a^{(i)} = f(i)$ and $b^{(i)} = g(i)$. Now consider the polynomial

$$h(X) = f(X) + g(X).$$

This polynomial provides a sharing of the sum c = a + b, and we have

$$c^{(i)} = h(i) = f(i) + g(i) = a^{(i)} + b^{(i)}.$$

Hence, the parties can compute a sharing of the output of an Add gate without any form of
communication between them.
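A short check of this, using two sharings over $\mathbb{F}_{101}$ taken from the worked example later in this section (they share the values 20 and 40 amongst six parties). The helper name is ours.

```python
P = 101


def lagrange_at_zero(xs):
    """Coefficients r_i with f(0) = sum_i r_i * f(i) for every f of degree below len(xs)."""
    coeffs = {}
    for i in xs:
        num = den = 1
        for j in xs:
            if j != i:
                num = num * j % P
                den = den * (j - i) % P
        coeffs[i] = num * pow(den, -1, P) % P
    return coeffs


a_shares = [44, 2, 96, 23, 86, 83]     # shares of a = 20 held by parties 1, ..., 6
b_shares = [26, 0, 63, 13, 52, 79]     # shares of b = 40 held by parties 1, ..., 6

# Each party adds its two shares locally; no messages are exchanged.
c_shares = [(a + b) % P for a, b in zip(a_shares, b_shares)]

r = lagrange_at_zero(range(1, 7))
c = sum(r[i] * c_shares[i - 1] for i in range(1, 7)) % P
print(c_shares, c)
```

The recombined value is the sum of the two secrets, here 20 + 40 = 60.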

22.3.2. Multiplication Gates: Computing the output of a Mult gate is more complicated. First
we recap the following property of Lagrange interpolation. If f(X) is a polynomial and we distribute
the values f(i) then there is a vector $(r_1, \ldots, r_n)$, called the recombination vector, such that

$$f(0) = \sum_{i=1}^{n} r_i \cdot f(i).$$

And the same vector works for all polynomials f(X) of degree at most n - 1.

To compute the Mult gate we perform the following four steps. We assume as input that each
party has a share of a and b via $a^{(i)} = f(i)$ and $b^{(i)} = g(i)$, where f(0) = a and g(0) = b. We wish
to compute a sharing $c^{(i)} = h(i)$ such that $h(0) = c = a \cdot b$.

• Each party locally computes $d^{(i)} = a^{(i)} \cdot b^{(i)}$.

• Each party produces a polynomial $\delta_i(X)$ of degree at most t such that $\delta_i(0) = d^{(i)}$.

• Each party i distributes to party j the value $d_{i,j} = \delta_i(j)$.

• Each party j computes $c^{(j)} = \sum_{i=1}^{n} r_i \cdot d_{i,j}$.

So why does this work? Consider the first step; here we are effectively computing a
polynomial $h'(X)$ of degree at most $2 \cdot t$, with $d^{(i)} = h'(i)$, and $c = h'(0)$. Hence, the only problem
with the sharing in the first step is that the underlying polynomial has too high a degree. The
main thing to note is that if

$$2 \cdot t \le n - 1 \tag{23}$$

then we have $c = \sum_{i=1}^{n} r_i \cdot d^{(i)}$. Now consider the polynomials $\delta_i(X)$ generated in the second step,
and consider what happens when we recombine them using the recombination vector, i.e. set

$$h(X) = \sum_{i=1}^{n} r_i \cdot \delta_i(X).$$

Since the $\delta_i(X)$ are all of degree at most t, the polynomial h(X) is also of degree at most t. We
also have that

$$h(0) = \sum_{i=1}^{n} r_i \cdot \delta_i(0) = \sum_{i=1}^{n} r_i \cdot d^{(i)} = c,$$

assuming $2 \cdot t \le n - 1$. Thus h(X) is a polynomial which could be used to share the value of the
product. Not only that, but it is the polynomial underlying the sharing produced in the final step.
To see this notice that

$$h(j) = \sum_{i=1}^{n} r_i \cdot \delta_i(j) = \sum_{i=1}^{n} r_i \cdot d_{i,j} = c^{(j)}.$$
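The four steps of the Mult gate can be simulated in one process as follows. This is only a sketch under our own naming, with all parties honest and the "distribution" of step three modelled by indexing into a list; it uses n = 6 and t = 2 over $\mathbb{F}_{101}$.

```python
import secrets

P, N, T = 101, 6, 2    # prime, number of parties, degree of the sharings


def lagrange_at_zero(xs):
    """Coefficients r_i with f(0) = sum_i r_i * f(i) for every f of degree below len(xs)."""
    coeffs = {}
    for i in xs:
        num = den = 1
        for j in xs:
            if j != i:
                num = num * j % P
                den = den * (j - i) % P
        coeffs[i] = num * pow(den, -1, P) % P
    return coeffs


def poly_shares(secret):
    """Shares f(1), ..., f(N) of a random degree-T polynomial with f(0) = secret."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(T)]
    return [sum(c * pow(j, e, P) for e, c in enumerate(coeffs)) % P for j in range(1, N + 1)]


def reconstruct(shares, parties):
    """Recover the shared value from the shares held by the listed parties."""
    r = lagrange_at_zero(parties)
    return sum(r[i] * shares[i - 1] for i in parties) % P


def multiply(a_shares, b_shares):
    r = lagrange_at_zero(range(1, N + 1))                    # recombination vector
    d = [(a * b) % P for a, b in zip(a_shares, b_shares)]    # step 1: local products
    resh = [poly_shares(d_i) for d_i in d]                   # step 2: delta_i with delta_i(0) = d^(i)
    # step 3: party i sends resh[i-1][j-1] to party j; step 4: party j recombines its column
    return [sum(r[i] * resh[i - 1][j - 1] for i in range(1, N + 1)) % P
            for j in range(1, N + 1)]


c_shares = multiply(poly_shares(20), poly_shares(40))
print(reconstruct(c_shares, [1, 2, 3]))    # 20 * 40 mod 101
```

That three shares suffice to recover the product shows that the resulting sharing again has degree at most t, which is the point of the resharing step.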

Example: So assuming t < n/2 we can produce a protocol which evaluates the arithmetic circuit
correctly. We illustrate the method by examining what would happen for our example circuit in
Figure 22.4, with p = 101. Recall that there are six parties; we shall assume that their inputs are
given by

$$x_1 = 20, \; x_2 = 40, \; x_3 = 21, \; x_4 = 31, \; x_5 = 1, \; x_6 = 71.$$

Each party first computes a sharing of their secret amongst the six parties. They do this by
each choosing a random polynomial of degree t = 2 and evaluating it at j = 1, 2, 3, 4, 5, 6. The
values obtained are then distributed securely to each party. Hence, party j obtains column j of
the following table, in which the entry in row i and column j is the share $x_i^{(j)}$.

    i \ j     1     2     3     4     5     6
      1      44     2    96    23    86    83
      2      26     0    63    13    52    79
      3       4    22    75    62    84    40
      4      93    48    98    41    79    10
      5      28    35    22    90    37    65
      6      64    58    53    49    46    44

As an exercise you should work out the associated polynomials corresponding to each row.

The parties then engage in the multiplication protocol so as to compute sharings of $x_7 = x_1 \cdot x_2$.
They first compute their local multiplication, by each multiplying the first two elements in their
column of the above table, then they form a sharing of this local multiplication. These sharings of six
numbers between six parties are then distributed securely. In our example run each party, for this
multiplication, obtains the sharings given by its column of the following table, in which row i
holds the sharing produced by party i.

    i \ j     1     2     3     4     5     6
      1      92    54    20    91    65    43
      2      10    46     7    95     7    46
      3      64   100    96    52    69    46
      4      23    38    41    32    11    79
      5      47    97    77    88    29     1
      6      95    34    11    26    79    69

Each party then takes the six values obtained and recovers their share of the value of $x_7$. We find
that the six shares of $x_7$ are given by

$$x_7^{(1)} = 9, \; x_7^{(2)} = 97, \; x_7^{(3)} = 54, \; x_7^{(4)} = 82, \; x_7^{(5)} = 80, \; x_7^{(6)} = 48.$$

Repeating the multiplication protocol twice more we also obtain a sharing of $x_8$ as

$$x_8^{(1)} = 26, \; x_8^{(2)} = 91, \; x_8^{(3)} = 38, \; x_8^{(4)} = 69, \; x_8^{(5)} = 83, \; x_8^{(6)} = 80,$$

and $x_9$ as

$$x_9^{(1)} = 57, \; x_9^{(2)} = 77, \; x_9^{(3)} = 30, \; x_9^{(4)} = 17, \; x_9^{(5)} = 38, \; x_9^{(6)} = 93.$$

We are then left with the two addition gates to produce the sharings of the wires $x_{10}$ and $x_{11}$.
These are obtained by locally adding together the various shared values so that

$$x_{11}^{(1)} = x_7^{(1)} + x_8^{(1)} + x_9^{(1)} \pmod{101} = 9 + 26 + 57 \pmod{101} = 92,$$

etc. to obtain

$$x_{11}^{(1)} = 92, \; x_{11}^{(2)} = 63, \; x_{11}^{(3)} = 21, \; x_{11}^{(4)} = 67, \; x_{11}^{(5)} = 100, \; x_{11}^{(6)} = 19.$$

The parties then make public these shares, and recover the hidden polynomial, of degree t = 2,
which produces these sharings, namely $7 + 41 \cdot X + 44 \cdot X^2$. Hence, the result of the multi-party
computation is the value 7.
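The numbers in this example can be replayed mechanically. The sketch below takes the resharing table for $x_7$ and the stated shares of $x_8$ and $x_9$ as data, recombines them as in the protocol, and compares the opened result with a direct evaluation of the function on the inputs. The variable and helper names are ours.

```python
P = 101


def lagrange_at_zero(xs):
    """Coefficients r_i with f(0) = sum_i r_i * f(i) for every f of degree below len(xs)."""
    coeffs = {}
    for i in xs:
        num = den = 1
        for j in xs:
            if j != i:
                num = num * j % P
                den = den * (j - i) % P
        coeffs[i] = num * pow(den, -1, P) % P
    return coeffs


# Row i: the resharing produced by party i of its local product; column j is sent to party j.
RESHARE_X7 = [
    [92, 54, 20, 91, 65, 43],
    [10, 46, 7, 95, 7, 46],
    [64, 100, 96, 52, 69, 46],
    [23, 38, 41, 32, 11, 79],
    [47, 97, 77, 88, 29, 1],
    [95, 34, 11, 26, 79, 69],
]
x8 = [26, 91, 38, 69, 83, 80]
x9 = [57, 77, 30, 17, 38, 93]

r = lagrange_at_zero(range(1, 7))                       # recombination vector for six parties
x7 = [sum(r[i + 1] * RESHARE_X7[i][j] for i in range(6)) % P for j in range(6)]
x11 = [(a + b + c) % P for a, b, c in zip(x7, x8, x9)]  # the two addition gates, done locally
result = sum(r[j + 1] * x11[j] for j in range(6)) % P   # opening the output wire

print(x7, x11, result)
print((20 * 40 + 21 * 31 + 1 * 71) % P)                 # direct evaluation on the inputs
```

Both computations give the same value, and the intermediate lists agree with the shares of $x_7$ and $x_{11}$ stated in the text.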

Now  assume  that  more  than  t  parties  are  corrupt,  in  the  sense  that  they  collude  to  try  to 
break  the  privacy  of  the  non-corrupted  parties.  The  corrupt  parties  can  now  come  together  and 
recover  any  of  the  underlying  secrets  in  the  scheme,  since  we  have  used  Shamir  secret  sharing  using 
polynomials  of  degree  at  most  t.  It  can  be  shown,  using  the  perfect  secrecy  of  the  Shamir  secret 
sharing  scheme,  that  as  long  as  no  more  than  t  parties  are  corrupt  then  the  above  protocol  is 
perfectly  secure. 

However,  it  is  only  perfectly  secure  assuming  all  parties  follow  the  protocol,  i.e.  we  are  in  the 
honest-but-curious  model.  As  soon  as  we  allow  parties  to  deviate  from  the  protocol,  they  can  force 
the  honest  parties  to  produce  invalid  results.  To  see  this  just  notice  that  a  dishonest  party  could 
simply  produce  an  invalid  sharing  of  its  product  in  the  second  part  of  the  multiplication  protocol 
above. 


22.4.  The  Multi-party  Case:  Malicious  Adversaries 


To  produce  a  scheme  which  is  secure  against  active  adversaries  either  we  need  to  force  all  parties 
to  follow  the  protocol  or  we  should  be  able  to  recover  from  errors  which  malicious  parties  introduce 
into  the  protocol.  It  is  the  second  of  these  two  approaches  which  we  shall  follow  in  this  section,  by 
using  the  error  correction  properties  of  the  Shamir  secret  sharing  scheme.  As  already  remarked,  the 
above  protocol  is  not  secure  against  malicious  adversaries,  due  to  the  ability  of  an  attacker  to  make 
the  multiplication  protocol  output  an  invalid  answer.  To  make  the  above  protocol  secure  against 
malicious  adversaries  we  make  use  of  various  properties  of  the  Shamir  secret  sharing  scheme. 

The protocol runs in two stages. The preprocessing stage does not involve any of the secret
inputs of the parties; it depends purely on the number of multiplication gates in the circuit. In the
main phase of the protocol the circuit is evaluated as in the previous section, but using a slightly
different multiplication protocol. Malicious parties can force the preprocessing stage to fail; however,
if it completes then the honest parties will be able to evaluate the circuit as required.

The preprocessing phase runs as follows. First, using the techniques from Chapter 19, a
pseudo-random secret sharing scheme, PRSS, and a pseudo-random zero sharing scheme, PRZS, are set
up. Then for each multiplication gate in the circuit we compute a random triple of sharings $a^{(i)}$,
$b^{(i)}$ and $c^{(i)}$ such that $c = a \cdot b$. This is done as follows:


• Using PRSS generate two random sharings, $a^{(i)}$ and $b^{(i)}$, of degree t.

• Using PRSS generate another random sharing $r^{(i)}$ of degree t.

• Using PRZS generate a sharing $z^{(i)}$ of degree $2 \cdot t$ of zero.

• Each party then locally computes $s^{(i)} = a^{(i)} \cdot b^{(i)} - r^{(i)} + z^{(i)}$. Note that this local
computation will produce a degree $2 \cdot t$ sharing of the value $s = a \cdot b - r$.

• Then the players broadcast their values $s^{(i)}$ and try to recover s. Here we make use of the
error detection properties of Reed-Solomon codes. If the number of malicious parties is
bounded by t < n/3, then any error in the degree-$(2 \cdot t)$ sharing will be detected. At this
stage the parties abort the protocol if any error is found.

• Now the players locally compute the shares $c^{(i)}$ from $c^{(i)} = s + r^{(i)}$.

Assuming the above preprocessing phase completes successfully, all we need to do is specify
how the parties implement a Mult in the presence of malicious adversaries. We assume the inputs
to the multiplication gate are given by $x^{(i)}$ and $y^{(i)}$ and we wish to compute a sharing $z^{(i)}$ of the
product $z = x \cdot y$. From the preprocessing stage, the parties also have for each gate, a triple of
shares $a^{(i)}$, $b^{(i)}$ and $c^{(i)}$ such that $c = a \cdot b$. The protocol for the multiplication is then as follows:

• Compute locally, and then broadcast, the values $d^{(i)} = x^{(i)} - a^{(i)}$ and $e^{(i)} = y^{(i)} - b^{(i)}$.

• Reconstruct the values of d = x - a and e = y - b.

• Locally compute the shares $z^{(i)} = d \cdot e + d \cdot b^{(i)} + e \cdot a^{(i)} + c^{(i)}$.

Note  that  the  reconstruction  in  the  second  step  can  be  completed,  even  if  the  corrupt  parties 
transmit  invalid  values,  as  long  as  there  are  at  most  t  <  n/3  malicious  parties.  Due  to  the 
error  correction  properties  of  Reed-Solomon  codes,  we  can  recover  from  any  errors  introduced  by 
malicious  parties.  The  above  protocol  produces  a  valid  sharing  of  the  output  of  the  multiplication 
gate  because 

$$d \cdot e + d \cdot b + e \cdot a + c = (x - a) \cdot (y - b) + (x - a) \cdot b + (y - b) \cdot a + c$$
$$= ((x - a) + a) \cdot ((y - b) + b)$$
$$= x \cdot y = z.$$
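The online multiplication step can be sketched as follows. Two simplifications are assumptions of ours and not part of the protocol above: the triple is handed out by a trusted dealer rather than produced via PRSS and PRZS, and the broadcast values are recombined directly, with no Reed-Solomon error correction, so the run only models honest behaviour. The toy parameters n = 7 and t = 2 satisfy t < n/3.

```python
import secrets

P, N, T = 101, 7, 2    # toy parameters with T < N/3


def lagrange_at_zero(xs):
    """Coefficients r_i with f(0) = sum_i r_i * f(i) for every f of degree below len(xs)."""
    coeffs = {}
    for i in xs:
        num = den = 1
        for j in xs:
            if j != i:
                num = num * j % P
                den = den * (j - i) % P
        coeffs[i] = num * pow(den, -1, P) % P
    return coeffs


def share(secret):
    """Shares f(1), ..., f(N) of a random degree-T polynomial with f(0) = secret."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(T)]
    return [sum(c * pow(j, e, P) for e, c in enumerate(coeffs)) % P for j in range(1, N + 1)]


def open_shares(shares):
    """Recombine broadcast shares (honest case only: no error correction is attempted)."""
    r = lagrange_at_zero(range(1, len(shares) + 1))
    return sum(r[j] * s for j, s in enumerate(shares, start=1)) % P


def dealer_triple():
    """Stand-in for the preprocessing phase: sharings of random a, b and of c = a * b."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    return share(a), share(b), share(a * b)


def triple_multiply(x_sh, y_sh, triple):
    a_sh, b_sh, c_sh = triple
    d = open_shares([(x - a) % P for x, a in zip(x_sh, a_sh)])   # d = x - a is made public
    e = open_shares([(y - b) % P for y, b in zip(y_sh, b_sh)])   # e = y - b is made public
    # Each party combines the public d, e with its own shares of b, a and c.
    return [(d * e + d * b + e * a + c) % P for a, b, c in zip(a_sh, b_sh, c_sh)]


z_sh = triple_multiply(share(21), share(31), dealer_triple())
print(open_shares(z_sh))    # 21 * 31 mod 101
```

Since a and b are uniformly random and never opened, the public values d and e reveal nothing about x and y.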


Chapter  Summary 


•  We  have  explained  how  to  perform  two-party  secure  computation,  in  the  case  of  honest- 
but-curious  adversaries,  using  Yao’s  garbled-circuit  construction. 

•  For  the  multi-party  case  we  have  presented  a  protocol  based  on  evaluating  arithmetic,  as 
opposed  to  binary,  circuits  which  is  based  on  Shamir  secret  sharing. 

•  The  main  issue  with  this  latter  protocol  is  how  to  evaluate  the  multiplication  gates.  We 
presented  two  methods:  The  first,  simpler,  method  is  applicable  when  one  is  only  dealing 
with  honest-but-curious  adversaries,  the  second,  more  involved,  method  is  for  the  case  of 
malicious  adversaries. 


Further  Reading 

The  original  presentation  of  Yao’s  idea  was  apparently  given  in  the  talk  which  accompanied  the 
paper  in  FOCS  1986,  however  the  paper  contains  no  explicit  details  of  the  protocol.  It  can  be 
transformed  into  a  scheme  for  malicious  adversaries  using  a  general  technique  of  Goldreich  et  al. 
The discussion of the secret sharing based solution for the honest and malicious cases closely follows
the treatment in Damgård et al.

I. Damgård, M. Geisler, M. Krøigaard and J.B. Nielsen. Asynchronous multiparty computation:
Theory and implementation. In Public Key Cryptography - PKC 2009, LNCS 5443, 160-179,
Springer, 2009.

Appendix


Basic  Mathematical  Terminology 


This  appendix  is  presented  as  a  series  of  notes  which  summarize  most  of  the  mathematical 
terminology  needed  in  this  book.  We  present  the  material  in  a  more  formal  manner  than  we  did  in 
Chapter  1  and  the  rest  of  the  book. 


A.1. Sets

Here we recap some basic definitions etc. which we list here for completeness.

Definition 100.1 (Set Union, Intersection, Difference and Cartesian Product). For two sets A, B
we define the union, intersection, difference and Cartesian product by

$$A \cup B = \{x : x \in A \text{ or } x \in B\},$$
$$A \cap B = \{x : x \in A \text{ and } x \in B\},$$
$$A \setminus B = \{x : x \in A \text{ and } x \notin B\},$$
$$A \times B = \{(x, y) : x \in A \text{ and } y \in B\}.$$

The statement $A \subseteq B$ means that for all $x \in A$ it follows that $x \in B$.

Using  these  definitions  one  can  prove  in  a  standard  way  all  the  basic  results  of  set  theory  that  one 
shows  in  school  using  Venn  diagrams. 

Lemma 100.2. If $A \subseteq B$ and $B \subseteq C$ then $A \subseteq C$.

Proof. Let x be an element of A; we wish to show that x is an element of C. Now as $A \subseteq B$ we
have that $x \in B$, and as $B \subseteq C$ we then deduce that $x \in C$. □

Notice that this is a proof whereas an argument using Venn diagrams to demonstrate something is
not a proof. Using Venn diagrams to show something merely shows you were not clever enough to
come up with a picture which proved the result false.

There are some standard sets which will be of interest in our discussions: $\mathbb{N}$ the set of natural
numbers, $\{0, 1, 2, 3, 4, \ldots\}$; $\mathbb{Z}$ the set of integers, $\{0, \pm 1, \pm 2, \pm 3, \pm 4, \ldots\}$; $\mathbb{Q}$ the set of rational
numbers, $\{p/q : p \in \mathbb{Z}, q \in \mathbb{N} \setminus \{0\}\}$; $\mathbb{R}$ the set of real numbers; $\mathbb{C}$ the set of complex numbers.

A.2. Relations

Next  we  define  relations  and  some  properties  that  they  have.  Relations,  especially  equivalence 
relations,  play  an  important  part  in  algebra  and  it  is  worth  considering  them  at  this  stage  so  it  is 
easier  to  understand  what  is  going  on  later. 

Definition 100.3 (Relation). A (binary) relation on a set $A$ is a subset of the Cartesian product
$A \times A$.

This we explain with an example: Consider the relationship “less than or equal to” between natural
numbers. This obviously gives us the set

$$\mathrm{LE} = \{(x, y) : x, y \in \mathbb{N}, \ x \text{ is less than or equal to } y\}.$$

In much the same way every relationship that you have met before can be written in this
set-theoretic way. An even better way to put the above is to define the relation “less than or equal to”
to be the set

$$\mathrm{LE} = \{(x, y) : x, y \in \mathbb{N}, \ x - y \notin \mathbb{N} \setminus \{0\}\}.$$

Obviously this is a very cumbersome notation so for a relation $R$ on a set $S$ we write

$$x \mathrel{R} y$$

if $(x, y) \in R$, i.e. if we now write $\le$ for LE we obtain the usual notation $1 \le 2$ etc. Relations which
are of interest in mathematics usually satisfy one or more of the following four properties.

Definition 100.4 (Properties of Relations).

• A relation $R$ on a set $S$ is reflexive if for all $x \in S$ we have $(x, x) \in R$.

• A relation $R$ on a set $S$ is symmetric if $(x, y) \in R$ implies that $(y, x) \in R$.

• A relation $R$ on a set $S$ is anti-symmetric if $(x, y) \in R$ and $(y, x) \in R$ implies that $x = y$.

• A relation $R$ on a set $S$ is transitive if $(x, y) \in R$ and $(y, z) \in R$ implies that $(x, z) \in R$.

We return to our example of $\le$. This relation $\le$ is certainly reflexive as $x \le x$ for all $x \in \mathbb{N}$. It
is not symmetric as $x \le y$ does not imply that $y \le x$; however it is anti-symmetric as $x \le y$ and
$y \le x$ imply that $x = y$. You should note that it is transitive as well.

Relations like $\le$ occur so frequently that we give them a name.

Definition 100.5 (Partial Order Relation). A relation is called a partial order relation if it is
reflexive, transitive and anti-symmetric.

Definition 100.6 (Total Order Relation). A relation which is transitive and anti-symmetric and
for which for all $x$ and $y$, with $x \ne y$, we have either $(x, y) \in R$ or $(y, x) \in R$ is called a total order
relation.

Whilst every total order relation is a partial order relation, the converse is not true. For example
consider the relation of

$$\mathrm{div} = \{(x, y) : x, y \in \mathbb{N}, \ x \text{ divides } y\}.$$

This is clearly a partial ordering, since

• It is reflexive, as $x$ divides $x$.

• It is transitive, as $x$ divides $y$ and $y$ divides $z$ implies $x$ divides $z$.

• It is anti-symmetric, as if $x$ divides $y$ and $y$ divides $x$ then $x = y$.

But it is clearly not a total order as 3 does not divide 4 and 4 does not divide 3.
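
The properties above can be checked mechanically on a finite set. The following is a minimal sketch, not part of the original text: it assumes the finite set $\{1, \ldots, 12\}$ as a stand-in for the natural numbers, and the helper names (`is_reflexive`, `is_total`, and so on) are chosen here purely for illustration.

```python
from itertools import product

def is_reflexive(S, R):
    return all((x, x) in R for x in S)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_antisymmetric(R):
    return all(x == y for (x, y) in R if (y, x) in R)

def is_transitive(R):
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

def is_total(S, R):
    # Any two distinct elements must be comparable one way or the other.
    return all((x, y) in R or (y, x) in R for x in S for y in S if x != y)

# Assumption: a finite window {1, ..., 12} stands in for the natural numbers.
S = range(1, 13)
LE = {(x, y) for x, y in product(S, S) if x <= y}
DIV = {(x, y) for x, y in product(S, S) if y % x == 0}

for name, R in (("LE", LE), ("DIV", DIV)):
    print(name, is_reflexive(S, R), is_antisymmetric(R),
          is_transitive(R), is_total(S, R))
```

On this window both relations pass the reflexive, anti-symmetric and transitive checks, while only LE passes the totality check, since neither (3, 4) nor (4, 3) lies in DIV.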

Another  important  type  of  relationship  is  that  of  an  equivalence  relation. 

Definition 100.7 (Equivalence Relation). A relation which is reflexive, symmetric and transitive
is called an equivalence relation.

The obvious example of $\mathbb{N}$ and the relation “is equal to” is an equivalence relation and hence gives
this type of relation its name. One of the major problems in any science is that of classification of
sets of objects. This amounts to placing the objects into mutually disjoint subsets. An equivalence
relation allows us to place elements into disjoint subsets. Each of these subsets is called an equivalence
class. If the properties we are interested in are constant over each equivalence class then we
may as well restrict our attention to the equivalence classes themselves. This often leads to greater
understanding. In the jargon this process is called factoring out by the equivalence relation. It
occurs frequently in algebra to define new objects from old, e.g. quotient groups. The following
example is probably the most familiar, being a description of modular arithmetic.

Let $m$ be a fixed positive integer. Consider the relation on $\mathbb{Z}$ which says $x$ is related
to $y$ if $(x - y)$ is divisible by $m$. This is an equivalence relation, which you should check. The
equivalence classes we denote by

$$\overline{0} = \{\ldots, -2 \cdot m, -m, 0, m, 2 \cdot m, \ldots\},$$

$$\overline{1} = \{\ldots, -2 \cdot m + 1, -m + 1, 1, m + 1, 2 \cdot m + 1, \ldots\},$$

$$\vdots$$

$$\overline{m - 1} = \{\ldots, -m - 1, -1, m - 1, 2 \cdot m - 1, 3 \cdot m - 1, \ldots\}.$$

Note that there are $m$ distinct equivalence classes, one for each of the possible remainders on division
by $m$. The classes are often called the residue classes modulo $m$. The resulting set $\{\overline{0}, \ldots, \overline{m - 1}\}$
is often denoted by $\mathbb{Z}/m\mathbb{Z}$ as we have divided out by all multiples of $m$. If $m$ is a prime number,
say $p$, then the resulting set is often denoted $\mathbb{F}_p$ as the resulting object is a field.
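
To see the residue classes concretely, one can list the members of each class that fall inside a finite window of the integers. This is only an illustrative sketch: the function name `residue_class`, the window $\{-10, \ldots, 10\}$ and the choice $m = 4$ are assumptions made here, not part of the text.

```python
def residue_class(r, m, window=range(-10, 11)):
    """Members of the class of r modulo m that lie in a finite window of Z."""
    # x is related to r exactly when m divides x - r.
    return [x for x in window if (x - r) % m == 0]

m = 4  # hypothetical modulus chosen for the example
classes = {r: residue_class(r, m) for r in range(m)}
for r, members in classes.items():
    print(r, members)

# The classes are mutually disjoint and together cover the whole window.
covered = sorted(x for members in classes.values() for x in members)
print(covered == list(range(-10, 11)))
```

Each integer in the window lands in exactly one of the $m$ classes, which is the sense in which an equivalence relation sorts elements into disjoint subsets.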

A.3. Functions

We  give  two  definitions  of  functions;  the  first  is  wordy  and  is  easier  to  get  hold  of,  the  second  is 
set-theoretic. 

Definition 100.8 (Function - v1). A function is a rule which maps the elements of one set, the
domain, to those of another, the codomain. Each element in the domain must map to one and only
one element in the codomain (a.k.a. the range of the function).

The point here is that the function is not just the rule, e.g. $f(x) = x^2$, but also the two sets that
one is using. A few examples will suffice.

(1) The rule $f(x) = \sqrt{x}$ is not a function from $\mathbb{R}$ to $\mathbb{R}$ since the square root of a negative
number is not in $\mathbb{R}$. It is also not a function (depending on how you define the symbol)
from $\mathbb{R}_{\ge 0}$ to $\mathbb{R}$ since every non-zero element of the domain has two square roots in the codomain.
But it is a function from $\mathbb{R}_{\ge 0}$ to $\mathbb{R}_{\ge 0}$.

(2) The rule $f(x) = 1/x$ is not a function from $\mathbb{R}$ to $\mathbb{R}$ but it is a function from $\mathbb{R} \setminus \{0\}$ to $\mathbb{R}$.

(3) Note that not every element of the codomain need have an element mapping to it. Hence,
the rule $f(x) = x^2$ taking elements of $\mathbb{R}$ to elements of $\mathbb{R}$ is a function.

Our  definition  of  a  function  is  unsatisfactory  as  it  would  also  require  a  definition  of  what  a  rule  is. 
In  keeping  with  the  spirit  of  everything  else  we  have  done  we  give  a  set-theoretic  description. 

Definition 100.9 (Function - v2). A function from the set $A$ to the set $B$ is a subset $F$ of $A \times B$
such that:

(1) If $(x, y) \in F$ and $(x, z) \in F$ then $y = z$.

(2) For all $x \in A$ there exists a $y \in B$ such that $(x, y) \in F$.

The set $A$ is called the domain, the set $B$ the codomain. The first condition means that each
element in the domain maps to at most one element in the codomain. The second condition means
that each element of the domain maps to at least one element in the codomain. Given a function
$f$ from $A$ to $B$ and an element $x$ of $A$ then we denote by $f(x)$ the unique element in $B$ such that
$(x, f(x)) \in f$.

Composition of Functions: One can compose functions, if the definitions make sense. Say one
has a function $f$ from $A$ to $B$ and a function $g$ from $B$ to $C$, then the function $g \circ f$ is the function
with domain $A$ and codomain $C$ consisting of the elements $(x, g(f(x)))$.

Lemma 100.10. Let $f$ be a function from $A$ to $B$, let $g$ be a function from $B$ to $C$ and let $h$ be a
function from $C$ to $D$, then we have

$$h \circ (g \circ f) = (h \circ g) \circ f.$$

Proof. Let $(a, d)$ belong to $(h \circ g) \circ f$. Then there exists an $(a, b) \in f$ and a $(b, d) \in (h \circ g)$ for
some $b \in B$, by definition of composition of functions. Again by definition there exists a $c \in C$
such that $(b, c) \in g$ and $(c, d) \in h$. Hence $(a, c) \in (g \circ f)$, which shows $(a, d) \in h \circ (g \circ f)$. Hence

$$(h \circ g) \circ f \subseteq h \circ (g \circ f).$$

Similarly one can show the other inclusion. □


One  function,  the  identity  function,  is  particularly  important. 

Definition 100.11 (Identity Function). The identity function $\mathrm{id}_A$ from a set $A$ to itself, is the set
$\{(x, x) : x \in A\}$.

Lemma 100.12. For any function $f$ from $A$ to $B$ we have

$$f \circ \mathrm{id}_A = \mathrm{id}_B \circ f = f.$$

Proof. Let $x$ be an element of $A$, then

$$(f \circ \mathrm{id}_A)(x) = f(\mathrm{id}_A(x)) = f(x) = \mathrm{id}_B(f(x)) = (\mathrm{id}_B \circ f)(x).$$

□
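
The set-theoretic view of functions translates directly into code if a function is stored as a set of pairs. The sketch below is illustrative only: the sets `A`, `B`, `C`, `D` and the functions `f`, `g`, `h` are small hypothetical examples, and the helper names are chosen here. It checks the two conditions of the set-theoretic definition and then confirms associativity of composition and the identity law on these examples.

```python
def is_function(F, A, B):
    """F is a subset of A x B in which each x in A occurs exactly once."""
    in_product = all(x in A and y in B for (x, y) in F)
    single_valued = all(y == z for (x, y) in F for (w, z) in F if x == w)
    total = all(any(x == a for (a, _) in F) for x in A)
    return in_product and single_valued and total

def compose(G, F):
    """G o F: apply F first, then G, giving the pairs (x, G(F(x)))."""
    return {(x, z) for (x, y) in F for (w, z) in G if y == w}

def identity(A):
    return {(x, x) for x in A}

# Hypothetical finite examples.
A, B, C, D = {1, 2, 3}, {"a", "b"}, {10, 20, 30}, {"yes", "no"}
f = {(1, "a"), (2, "b"), (3, "a")}
g = {("a", 10), ("b", 30)}
h = {(10, "yes"), (20, "no"), (30, "yes")}

print(is_function(f, A, B))
print(compose(h, compose(g, f)) == compose(compose(h, g), f))
print(compose(f, identity(A)) == f == compose(identity(B), f))
```

A finite check of this kind is of course not a proof; it only confirms the statements on the chosen examples.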


Injective,  Surjective  and  Bijective  Functions:  Two  properties  that  we  shall  use  all  the  time 
are  the  following. 

Definition 100.13 (Injective and Surjective).

A function $f$ from $A$ to $B$ is said to be injective (or 1:1) if for any two elements, $x$, $y$ of $A$ with
$f(x) = f(y)$ we have $x = y$.

A function $f$ from $A$ to $B$ is said to be surjective (or onto) if for every element $b \in B$ there exists
an element $a \in A$ such that $f(a) = b$.

A function which is both injective and surjective is called bijective (or a 1:1 correspondence). We
shall now give some examples.

(1) The function from $\mathbb{R}$ to $\mathbb{R}$ given by $f(x) = x + 2$ is bijective.

(2) The function from $\mathbb{N}$ to $\mathbb{N}$ given by $f(x) = x + 2$ is injective but not surjective as the
elements 0 and 1 are not the image of anything.

(3) The function from $\mathbb{R}$ to $\mathbb{R}_{\ge 0}$ given by $f(x) = x^2$ is surjective as every non-negative real
number has a square root in $\mathbb{R}$ but it is not injective as if $x^2 = y^2$ then we could have
$x = -y$.

The  following  gives  us  a  good  reason  to  study  bijective  functions. 

Lemma 100.14. A function $f : A \to B$ is bijective if and only if there exists a function $g : B \to A$
such that $f \circ g$ and $g \circ f$ are the identity functions.

We leave the proof of this lemma as an exercise. Note that applying this lemma to the resulting $g$
means that $g$ is also bijective. Such a function as $g$ in the above lemma is called the inverse of $f$
and is usually denoted $f^{-1}$. Note that a function only has an inverse if it is bijective.
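
For functions on finite sets these properties are easy to test. The following sketch stores a function as a Python dictionary on a small hypothetical domain $\{-3, \ldots, 3\}$; the names `shift` and `square` mirror the rules $x + 2$ and $x^2$ used in the examples, restricted to this finite domain as an assumption of the sketch.

```python
def is_injective(f):
    """f is a dict; injective means no two keys share a value."""
    return len(set(f.values())) == len(f)

def is_surjective(f, codomain):
    return set(f.values()) == set(codomain)

def inverse(f, codomain):
    """Return the inverse dict if f is bijective onto codomain, else None."""
    if not (is_injective(f) and is_surjective(f, codomain)):
        return None
    return {y: x for x, y in f.items()}

A = range(-3, 4)                # hypothetical finite domain
shift = {x: x + 2 for x in A}   # the rule x -> x + 2
square = {x: x * x for x in A}  # the rule x -> x^2

g = inverse(shift, range(-1, 6))
print(all(g[shift[x]] == x for x in A))   # g o shift is the identity on A
print(inverse(square, {0, 1, 4, 9}))      # None: square is not injective
```

The second call returns `None` because `square` is surjective onto $\{0, 1, 4, 9\}$ but sends $x$ and $-x$ to the same value.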


A.4. Permutations

We let $A$ be a finite set of cardinality $n$; without loss of generality we can assume that $A =
\{1, 2, \ldots, n\}$. A bijective function from $A$ to $A$ is called a permutation. The set of all permutations
on a set of cardinality $n$ is denoted by $S_n$.

Suppose $A = \{1, 2, 3\}$, then we have the permutation $f(1) = 2$, $f(2) = 3$ and $f(3) = 1$. This is
a very cumbersome way to write a permutation. Mathematicians (being lazy people) have invented
the following notation: the function $f$ above is written as

$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}.$$

What should be noted about this notation (which applies for arbitrary $n$) is that all the numbers
between 1 and $n$ occur exactly once on each row. The first row is always given as the numbers
1 to $n$ in increasing order. Any such matrix with these properties represents a permutation, and
all permutations can be represented by such a matrix. This leads us to the following elementary
result.

Lemma 100.15. The cardinality of the set $S_n$ is $n!$.

Proof. This is a well-known argument. There are $n$ choices for the first element in the second
row of the above matrix. Then there are $n - 1$ choices for the second element in the second row
and so on. □

If $\sigma$ is a permutation on a set $S$ then we usually think of $\sigma$ acting on the set. So if $s \in S$ then we
write $s^\sigma$ or $\sigma(s)$ for the action of $\sigma$ on the element $s$.

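Suppose we define the permutations

$$g = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \qquad f = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}.$$

As permutations are nothing but functions we can compose them. Remembering that $g \circ f$ means
apply the function $f$ and then apply the function $g$ we see that

$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} \circ \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}$$

means $1 \mapsto 3 \mapsto 1$, $2 \mapsto 2 \mapsto 3$ and $3 \mapsto 1 \mapsto 2$. Hence, the result of composing the above two
permutations is

$$\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}. \qquad (24)$$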

However, this can cause confusion when using our “acting on a set” notation above. For example

$$1^{g \circ f} = g(f(1)) = 1,$$

whereas reading the exponent from left to right, applying $g$ and then $f$, would give $f(g(1)) = 2$;
so we are unable to read the permutation from left to right. However, if we use another notation,
say $\cdot$, to mean

$$f \cdot g = g \circ f$$

then we are able to read the expression from left to right. We shall call this operation multiplying
permutations.
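
The difference between composing and multiplying permutations is easy to get wrong, so a small sketch may help. Here a permutation is stored as a dictionary sending each element to its image; `g` and `f` are the two permutations defined above, while the function names `compose` and `multiply` are chosen here for illustration.

```python
def compose(g, f):
    """g o f: apply f first, then g."""
    return {x: g[f[x]] for x in f}

def multiply(f, g):
    """f . g = g o f: apply f first, then g, so it reads left to right."""
    return compose(g, f)

g = {1: 2, 2: 3, 3: 1}
f = {1: 3, 2: 2, 3: 1}

print(compose(g, f))   # {1: 1, 2: 3, 3: 2}
print(multiply(g, f))  # {1: 2, 2: 1, 3: 3}
```

The two results differ, which is why the order of the factors matters once the left-to-right convention is adopted.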


Cycle Notation: Mathematicians, as we said, are by nature lazy people and this notation we
have introduced is still a little too much. For instance we always write down the numbers $1, \ldots, n$
in the top row of each matrix to represent a permutation. Also some columns are redundant, for
instance the first column of the permutation in equation (24). We now introduce another notation
for permutations which is concise and clear. We first need to define what a cycle is.

Definition 100.16 (Cycle). By a cycle or $n$-cycle we mean the object $(x_1, \ldots, x_n)$ with distinct
$x_i \in \mathbb{N} \setminus \{0\}$. This represents the permutation $f(x_1) = x_2$, $f(x_2) = x_3$, $\ldots$, $f(x_{n-1}) = x_n$, $f(x_n) =
x_1$ and for $x \notin \{x_1, \ldots, x_n\}$ we have $f(x) = x$.

For instance we have

$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} = (1, 2, 3) = (2, 3, 1) = (3, 1, 2).$$

Notice that a cycle is not a unique way of representing a permutation. Most permutations cannot
be written as a single cycle, but they can be written as a product of cycles. For example we have

$$\begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} = (1, 3) \circ (2) = (3, 1) \circ (2).$$

The identity permutation is represented by $()$. Again, as mathematicians are lazy we always write
$(1, 3) \circ (2) = (1, 3)$. This can lead to ambiguities as $(1, 2)$ could represent a function from

$$\{1, 2\} \text{ to } \{1, 2\}$$

or

$$\{1, 2, \ldots, n\} \text{ to } \{1, 2, \ldots, n\}.$$

However, which function it represents is usually clear from the context.

Two cycles $(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_m)$ are called disjoint if $\{x_1, \ldots, x_n\} \cap \{y_1, \ldots, y_m\} = \emptyset$. It
is easy to show that if $\sigma$ and $\tau$ are two disjoint cycles then

$$\sigma \cdot \tau = \tau \cdot \sigma.$$

Note that this is not true for cycles which are not disjoint, e.g.

$$(1, 2, 3, 4) \cdot (3, 5) = (1, 2, 5, 3, 4) \ne (1, 2, 3, 5, 4) = (3, 5) \cdot (1, 2, 3, 4).$$

Our action of permutations on the underlying set can now be read easily from left to right,

$$2^{(1, 2, 3, 4) \cdot (3, 5)} = 3^{(3, 5)} = 5 = 2^{(1, 2, 5, 3, 4)}$$

as the permutation $(1, 2, 3, 4)$ maps 2 to 3 and the permutation $(3, 5)$ maps 3 to 5.

What  really  makes  disjoint  cycles  interesting  is  the  following. 

Lemma  100.17.  Every  permutation  can  be  written  as  a  product  of  disjoint  cycles. 

Proof. Let $\sigma$ be a permutation on $\{1, \ldots, n\}$. Let $\sigma_1$ denote the cycle

$$(1, \sigma(1), \sigma(\sigma(1)), \ldots, \sigma(\ldots \sigma(1) \ldots)),$$

where we keep applying $\sigma$ until we get back to 1. We then take an element $x$ of $\{1, \ldots, n\}$ such
that $\sigma_1(x) = x$, if one exists, and consider the cycle $\sigma_2$ given by

$$(x, \sigma(x), \sigma(\sigma(x)), \ldots, \sigma(\ldots \sigma(x) \ldots)).$$

We then take an element of $\{1, \ldots, n\}$ which is fixed by $\sigma_1$ and $\sigma_2$ to create a cycle $\sigma_3$. We continue
this way until we have used all elements of $\{1, \ldots, n\}$. The resulting cycles $\sigma_1, \ldots, \sigma_t$ are obviously
disjoint and their product is equal to the permutation $\sigma$. □

What is nice about this proof is that it is constructive. Given a permutation we can follow the
procedure in the proof to obtain the permutation as a product of disjoint cycles. Consider the
permutation

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 2 & 3 & 7 & 6 & 8 & 4 & 1 & 5 & 9 \end{pmatrix}.$$

We have $\sigma(1) = 2$, $\sigma(2) = 3$, $\sigma(3) = 7$ and $\sigma(7) = 1$ so the first cycle is

$$\sigma_1 = (1, 2, 3, 7).$$

The next element of $\{1, \ldots, 9\}$ which we have not yet considered is 4. We have $\sigma(4) = 6$ and
$\sigma(6) = 4$ so $\sigma_2 = (4, 6)$. Continuing in this way we find $\sigma_3 = (5, 8)$ and $\sigma_4 = (9)$. Hence we have

$$\sigma = (1, 2, 3, 7)(4, 6)(5, 8)(9) = (1, 2, 3, 7)(4, 6)(5, 8).$$
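
Because the proof is constructive, it can be turned into a short program. The following is a minimal sketch under the assumption that a permutation is given as a dictionary on $\{1, \ldots, n\}$; the function name `disjoint_cycles` is chosen here, and the input is the bottom row of the permutation in the worked example.

```python
def disjoint_cycles(sigma):
    """Decompose a permutation (dict on {1, ..., n}) into disjoint cycles."""
    seen, cycles = set(), []
    for start in sorted(sigma):
        if start in seen:
            continue  # already placed in an earlier cycle
        cycle, x = [], start
        while x not in seen:  # keep applying sigma until we return to start
            seen.add(x)
            cycle.append(x)
            x = sigma[x]
        cycles.append(tuple(cycle))
    return cycles

bottom_row = [2, 3, 7, 6, 8, 4, 1, 5, 9]
sigma = {i: v for i, v in enumerate(bottom_row, start=1)}
print(disjoint_cycles(sigma))  # [(1, 2, 3, 7), (4, 6), (5, 8), (9,)]
```

Fixed points appear as cycles of length one, such as `(9,)`, which the lazy convention then drops.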
A.5. Operations

In mathematics one meets lots of binary operations: ordinary addition and multiplication,
composition of functions, matrix addition and multiplication, multiplication of permutations, etc.; the
list is somewhat endless. All of these binary operations have a lot in common; they also have many
differences. For instance, for two real numbers $x$ and $y$ we have $x \cdot y = y \cdot x$, but for two $2 \times 2$
matrices with real entries, $A$ and $B$, it is not true that we always have $A \cdot B = B \cdot A$. To study
the similarities and differences between these operations we formalize the concept below. We then
prove some results which are true of operations given some basic properties; these results can then
be applied to any of the operations above which satisfy the given properties. Hence our abstraction
will allow us to prove results in many areas at once.

Definition 100.18 (Operation). A (binary) operation on a set $A$ is a function from the domain
$A \times A$ to the codomain $A$.

So if $A = \mathbb{R}$ we could have the function $f(x, y) = x + y$. Writing $f(x, y)$ all the time can become a
pain so we often write a symbol between the $x$ and the $y$ to denote the operation, e.g.

$$x \cdot y, \quad x + y, \quad x \oplus y, \quad x \circ y, \quad x \odot y, \quad x \diamond y, \quad x \wedge y, \quad x \vee y, \quad x \star y.$$

Most often we write $x + y$ and $x \cdot y$; we refer to the former as additive notation and the latter as
multiplicative notation. One should bear in mind that we may not be actually referring to ordinary
multiplication and addition when we use these terms/notations.

Associative  and  Commutative:  Operations  can  satisfy  various  properties. 

Definition 100.19 (Associative). An operation $\circ$ is said to be associative if for all $x$, $y$ and $z$ we
have

$$(x \circ y) \circ z = x \circ (y \circ z).$$

Operations which are associative include all the examples mentioned above. Non-associative
operations do exist (for example the subtraction operation on the integers is non-associative) but we
shall not be interested in them much. Note that for an associative operation the expression

$$w \circ x \circ y \circ z$$

is well defined; as long as we do not change the relative position of any of the terms it does not
matter which operation we carry out first.

Definition 100.20 (Commutative). An operation $\vee$ is said to be commutative if for all $x$ and $y$
we have

$$x \vee y = y \vee x.$$

Ordinary addition, multiplication and matrix addition are commutative, but multiplication of
matrices and multiplication of permutations are not.

Identities: 

Definition 100.21 (Identity). An operation $\cdot$ on the set $A$ is said to have an identity if there exists
an element $e$ of $A$ such that for all $x$ we have

$$e \cdot x = x \cdot e = x.$$

The first thing we notice is that all the example operations above possess an identity, but ordinary
subtraction on the set $\mathbb{R}$ does not possess an identity. The following shows that there can be at
most one identity for any given operation.

Lemma 100.22. If an identity exists then it is unique. It is then called “the” identity.

Proof. Suppose there are two identities $e$ and $e'$. As $e$ is an identity we have $e \cdot e' = e'$ and as $e'$
is an identity we have $e \cdot e' = e$. Hence, we have $e' = e \cdot e' = e$. □

Usually  if  we  are  using  an  additive  notation  then  we  denote  the  identity  by  0  to  correspond  with 
the  identity  for  ordinary  addition,  and  if  we  are  using  the  multiplicative  notation  then  we  denote 
the  identity  by  either  1  or  e. 

Inverses: 

Definition 100.23 (Inverses). Let $+$ be an operation on a set $A$ with identity 0. Let $x \in A$. If
there is a $y \in A$ such that

$$x + y = y + x = 0$$

then we call $y$ an inverse of $x$.

In the additive notation it is usual to write the inverse of $x$ as $-x$. In the multiplicative notation
it is usual to write the inverse as $x^{-1}$.

All elements in $\mathbb{R}$ have inverses with respect to ordinary addition. All elements in $\mathbb{R}$ except
zero have inverses with respect to ordinary multiplication. Every permutation has an inverse with
respect to multiplication of permutations. However, only square matrices of non-zero determinant
have inverses with respect to matrix multiplication. The next result shows that an element can
have at most one inverse assuming the operation is associative.

Lemma 100.24. Consider an associative operation on a set $A$ with identity $e$. Let $x \in A$ have an
inverse $y$, then this inverse is unique; we call it “the” inverse.

Proof. Suppose there are two such inverses $y$ and $y'$, then

$$y = y \cdot e = y \cdot (x \cdot y') = (y \cdot x) \cdot y' = e \cdot y' = y'.$$

Note how we used the associativity property above. □

Lemma 100.25. Consider an associative operation on a set $A$ with an identity $e$. If $a, b, x \in A$,
where $x$ has an inverse, with $a \cdot x = b \cdot x$ then $a = b$.

Proof. Let $y$ denote the inverse of $x$, then we have

$$a = a \cdot e = a \cdot (x \cdot y) = (a \cdot x) \cdot y = (b \cdot x) \cdot y = b \cdot (x \cdot y) = b \cdot e = b.$$

□


We  shall  assume  from  now  on  that  all  operations  we  shall  encounter  are  associative. 
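
On a finite set the properties discussed in this section can be tested by brute force. The sketch below is an illustration only: it assumes the set $\{0, \ldots, 5\}$ with addition, subtraction and multiplication taken modulo 6 as stand-ins for the operations in the text, and the helper names are chosen here.

```python
from itertools import product

def is_associative(A, op):
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(A, repeat=3))

def is_commutative(A, op):
    return all(op(x, y) == op(y, x) for x, y in product(A, repeat=2))

def identity_of(A, op):
    """Return the identity element of op on A, or None if there is none."""
    for e in A:
        if all(op(e, x) == x == op(x, e) for x in A):
            return e
    return None

def inverses_of(A, op, x):
    """All y in A with x op y = y op x = identity (empty if no identity)."""
    e = identity_of(A, op)
    return [y for y in A if op(x, y) == e == op(y, x)]

A = range(6)  # assumption: arithmetic modulo 6 as a finite example
add = lambda x, y: (x + y) % 6
sub = lambda x, y: (x - y) % 6
mul = lambda x, y: (x * y) % 6

print(is_associative(A, add), is_associative(A, sub))
print(identity_of(A, add), identity_of(A, sub), identity_of(A, mul))
print(inverses_of(A, add, 2), inverses_of(A, mul, 2), inverses_of(A, mul, 5))
```

In this example no element ever has more than one inverse, in line with the uniqueness lemma, and subtraction fails both associativity and the existence of an identity.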

Powers: Say one wishes to perform the same operation over and over again, for example

$$x \vee x \vee x \vee \cdots \vee x \vee x.$$

If our operation is written additively then we write, for $n \in \mathbb{N}$, $n \cdot x$ for $x + \cdots + x$, whilst if our
operation is written multiplicatively we write $x^n$ for $x \cdots x$. The following result can then be proved
by induction.

Lemma 100.26 (Law of Powers). For any operation $\circ$ which is associative we have

$$g^m \circ g^n = g^{m+n}, \qquad (g^m)^n = g^{m \cdot n}.$$

We can extend the notation to all $n \in \mathbb{Z}$ if $x$ has an inverse (and the operation an identity), by
$(-n) \cdot x = n \cdot (-x)$ and $x^{-n} = (x^{-1})^n$. The following lemma is obvious, but often causes problems
as it is slightly counter-intuitive. To get it in your brain consider the case of matrices.

Lemma 100.27. Consider a set $G$ with an associative operation which has an identity, $e$. If $x, y \in G$
possess inverses then we have

(1) $(x^{-1})^{-1} = x$.

(2) $(x \cdot y)^{-1} = y^{-1} \cdot x^{-1}$.

Proof. For the first we notice

$$x^{-1} \cdot x = e = x \cdot x^{-1}.$$

Hence by definition of inverses the result follows. For the second we have

$$x \cdot y \cdot (y^{-1} \cdot x^{-1}) = x \cdot (y \cdot y^{-1}) \cdot x^{-1} = x \cdot e \cdot x^{-1} = x \cdot x^{-1} = e,$$

and again the result follows by the definition of inverses. □


Additive        Multiplicative
$x + y$         $x \cdot y$
$0$             $1$ or $e$
$-x$            $x^{-1}$
$n \cdot x$     $x^n$

A.6. Groups

Definition 100.28 (Group). A group is a set $G$ with a binary operation $\circ$ such that

(1) $\circ$ is associative.

(2) $\circ$ has an identity element in $G$.

(3) Every element of $G$ has an inverse.

Note  that  we  have  not  said  that  the  binary  operation  is  closed  as  this  is  implicit  in  our  definition  of 
what  an  operation  is.  If  the  operation  is  also  commutative  then  we  say  that  we  have  a  commutative, 
or  abelian,  group.  The  following  are  all  groups;  as  an  exercise  you  should  decide  on  the  identity 
element,  what  the  inverse  of  each  element  is,  and  which  groups  are  abelian. 

(1) The integers $\mathbb{Z}$ under addition (written $\mathbb{Z}^+$).

(2) The rationals $\mathbb{Q}$ under addition (written $\mathbb{Q}^+$).

(3) The reals $\mathbb{R}$ under addition (written $\mathbb{R}^+$).

(4) The complex numbers $\mathbb{C}$ under addition (written $\mathbb{C}^+$).

(5) The rationals (excluding zero) $\mathbb{Q} \setminus \{0\}$ under multiplication (written $\mathbb{Q}^*$).

(6) The reals (excluding zero) $\mathbb{R} \setminus \{0\}$ under multiplication (written $\mathbb{R}^*$).

(7) The complex numbers (excluding zero) $\mathbb{C} \setminus \{0\}$ under multiplication (written $\mathbb{C}^*$).

(8) The set of $n$-ary vectors over $\mathbb{Z}$, $\mathbb{Q}$, $\ldots$, etc. under vector addition.

(9) The set of $n \times m$ matrices with integer, rational, real or complex entries under matrix
addition. This set is written $M_{n \times m}(\mathbb{Z})$, etc.; however when $m = n$ we write $M_n(\mathbb{Z})$ instead
of $M_{n \times n}(\mathbb{Z})$.

(10) The general linear group (the matrices of non-zero determinant) over the rationals, reals
or complex numbers under matrix multiplication (written $GL_n(\mathbb{Q})$, etc.).

(11) The special linear group (the matrices of determinant 1) over the integers, rationals etc.
(written $SL_n(\mathbb{Z})$, etc.).

(12) The set of permutations on $n$ elements, written $S_n$ and often called the symmetric group
on $n$ letters.

(13) The set of continuous (differentiable) functions from $\mathbb{R}$ to $\mathbb{R}$ under pointwise addition.

The list is endless; a group is one of the most basic concepts in mathematics. However, not all
mathematical objects are groups. Consider the following list of sets and operations which are not
groups; you should also decide why.

(1)  The  natural  numbers  N  under  ordinary  addition  or  multiplication. 

(2)  The  integers  Z  under  subtraction  or  multiplication. 

We  now  give  a  number  of  definitions  related  to  groups. 

Definition 100.29 (Orders).

The order of a group is the number of elements in the underlying set $G$ and is denoted $|G|$ or $\#G$.
The order of a group can be infinite.

The order of an element $g \in G$ is the least positive integer $n$ such that $g^n = e$, if such an $n$ exists;
otherwise we say that $g$ has infinite order.

Definition  100.30  (Cyclic  Groups  and  Generators). 

A cyclic group $G$ is a group which has an element $g$ such that each element of $G$ can be written
in the form $g^n$ for some $n \in \mathbb{Z}$ (in multiplicative notation). If this is the case then one can write
$G = \langle g \rangle$ and one says that $g$ is a generator of the group $G$.

Note that the only element in a group with order one is the identity element and if $x$ is an element
of a group then $x$ and $x^{-1}$ have the same order.

Lemma 100.31. If $G = \langle g \rangle$ and $g$ has finite order $n$ then the order of $G$ is $n$.

Proof. Every element of $G$ can be written as $g^m$ for some $m \in \mathbb{Z}$, but as $g$ has order $n$ there are
only $n$ distinct such values, as

$$g^{n+1} = g^n \circ g = e \circ g = g.$$

So the group $G$ has only $n$ elements. □

Let us relate this back to the permutations which we introduced earlier. Recall that the set of
permutations on a fixed set $S$ forms a group under composition. It is easy to see that if $\sigma \in S_n$ is
a $k$-cycle then $\sigma$ has order $k$ in $S_n$. One can also easily see that if $\sigma$ is a product of disjoint cycles
then the order of $\sigma$ is the least common multiple of the orders of the constituent cycles.
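
The least common multiple statement can be checked numerically by computing the order of a permutation in two ways. This is a minimal sketch assuming permutations are stored as dictionaries; the function names are chosen here and the sample permutation is the nine-element example from the section on cycle notation.

```python
from functools import reduce
from math import gcd

def cycle_lengths(sigma):
    """Lengths of the disjoint cycles of a permutation given as a dict."""
    seen, lengths = set(), []
    for start in sigma:
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = sigma[x]
            length += 1
        if length:
            lengths.append(length)
    return lengths

def order_via_lcm(sigma):
    # lcm(a, b) = a * b // gcd(a, b), folded over all cycle lengths.
    return reduce(lambda a, b: a * b // gcd(a, b), cycle_lengths(sigma), 1)

def order_by_powers(sigma):
    """Least n >= 1 such that applying sigma n times gives the identity."""
    identity = {x: x for x in sigma}
    power, n = dict(sigma), 1
    while power != identity:
        power = {x: sigma[power[x]] for x in sigma}
        n += 1
    return n

sigma = dict(enumerate([2, 3, 7, 6, 8, 4, 1, 5, 9], start=1))
print(order_via_lcm(sigma), order_by_powers(sigma))  # 4 4
```

Both computations agree, since the cycles have lengths 4, 2, 2 and 1.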

A subset $S$ of $G$ is said to generate $G$ if every element of $G$ can be written as a product of
elements of $S$ and their inverses. For instance

• the group $S_3$ is generated by the set $\{(1, 2), (1, 2, 3)\}$,

• the group $\mathbb{Z}^+$ is generated by the element 1,

• the group $\mathbb{Q}^*$ is generated by $-1$ together with the set of prime numbers; it therefore has
an infinite number of generators.

Note that the order of a group says nothing about the number of generators it has, although the
order is clearly a trivial upper bound on the number of generators.

An important family of finite groups which are easy to understand consists of the groups obtained by
considering the integers modulo a number $m$. Recall that we have $\mathbb{Z}/m\mathbb{Z} = \{0, 1, \ldots, m - 1\}$. This is a group
with respect to addition, when we take the non-negative remainder after forming the sum of two
elements. It is not a group with respect to multiplication in general, even when we exclude 0. We
can, however, get around this by setting

$$(\mathbb{Z}/m\mathbb{Z})^* = \{x \in \mathbb{Z}/m\mathbb{Z} : \gcd(m, x) = 1\}.$$

This latter set is a group with respect to multiplication, when we take the non-negative remainder
after forming the product of two elements. The order of $(\mathbb{Z}/m\mathbb{Z})^*$ is denoted $\phi(m)$, the Euler $\phi$
function. This is an important function in the theory of numbers. As an example we have

$$\phi(p) = p - 1,$$

if $p$ is a prime number. We shall return to this function later.
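
For small moduli the set $(\mathbb{Z}/m\mathbb{Z})^*$ and the value of the Euler function can be computed directly from the definition. The sketch below does this by brute force; the function names `units`, `phi` and `is_group_under_mul` are chosen here for illustration, and associativity is taken for granted since it is inherited from integer multiplication.

```python
from math import gcd

def units(m):
    """Elements of (Z/mZ)*, represented by remainders in {0, ..., m-1}."""
    return [x for x in range(m) if gcd(m, x) == 1]

def phi(m):
    """Euler phi function, computed as the order of (Z/mZ)*."""
    return len(units(m))

def is_group_under_mul(elements, m):
    """Check identity, closure and inverses for multiplication modulo m."""
    closed = all((x * y) % m in elements for x in elements for y in elements)
    has_inverses = all(any((x * y) % m == 1 for y in elements)
                       for x in elements)
    return 1 in elements and closed and has_inverses

print(units(12), phi(12))                        # [1, 5, 7, 11] 4
print(is_group_under_mul(units(12), 12))         # True
print(is_group_under_mul(list(range(1, 12)), 12))  # False
print(phi(7))                                    # 6
```

Excluding only 0 is not enough modulo 12, for example because $2 \cdot 6$ leaves remainder 0, whereas the elements coprime to 12 do form a group.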

Subgroups:  We  now  turn  our  attention  to  subgroups. 

Definition 100.32 (Subgroup). A subgroup $H$ of a group $G$ is a subset of $G$ which is also a group
with respect to the operation of $G$. We write in this case $H < G$. A subgroup $H$ is called trivial if
it is equal to the whole group $G$, or is equal to the group consisting of just the identity element.

Note that by this definition $GL_n(\mathbb{R})$ is not a subgroup of $M_n(\mathbb{R})$, although $GL_n(\mathbb{R}) \subseteq M_n(\mathbb{R})$. The
operation on $GL_n(\mathbb{R})$ is matrix multiplication whilst that on $M_n(\mathbb{R})$ is matrix addition. However
we do have the subgroup chains:

$$\mathbb{Z}^+ < \mathbb{Q}^+ < \mathbb{R}^+ < \mathbb{C}^+,$$

$$\mathbb{Q}^* < \mathbb{R}^* < \mathbb{C}^*.$$

If we also identify $x \in \mathbb{Z}$ with the diagonal matrix $\mathrm{diag}(x, \ldots, x)$ then we also have that $\mathbb{Z}^+$ is a
subgroup of $M_n(\mathbb{Z})$ and so on.

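As an important example, consider the set $2\mathbb{Z}$ of even integers, which is a subgroup of $\mathbb{Z}^+$. If
we write $\mathbb{Z}^+ = 1\mathbb{Z}$, then we have $n\mathbb{Z} < m\mathbb{Z}$ if and only if $m$ divides $n$, where

$$m\mathbb{Z} = \{\ldots, -2 \cdot m, -m, 0, m, 2 \cdot m, \ldots\}.$$

We hence obtain various chains of subgroups of $\mathbb{Z}^+$,

$$18\mathbb{Z} < 6\mathbb{Z} < 2\mathbb{Z} < \mathbb{Z}^+,$$

$$18\mathbb{Z} < 9\mathbb{Z} < 3\mathbb{Z} < \mathbb{Z}^+,$$

$$18\mathbb{Z} < 6\mathbb{Z} < 3\mathbb{Z} < \mathbb{Z}^+.$$

We now show that these are the only such subgroups of $\mathbb{Z}^+$.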

Lemma 100.33. The only subgroups of $\mathbb{Z}^+$ are $n\mathbb{Z}$ for some non-negative integer $n$.

Proof. Let $H$ be a subgroup of $\mathbb{Z}^+$. If $H = \{0\}$ then $H = 0\mathbb{Z}$, so assume that $H$ contains a
non-zero element $x$; it then also contains the inverse $-x$. Hence $H$ contains at least one positive
element. Let $n$ denote the least positive element of $H$. Since $H$ is closed under addition and
inverses we have $n\mathbb{Z} \subseteq H$.

Now let $m$ denote an arbitrary non-zero element of $H$. By Euclidean division, there exist
$q, r \in \mathbb{Z}$ with $0 \le r < n$ such that

\[ m = q \cdot n + r. \]

Hence $r = m - q \cdot n \in H$, since $H$ is a group under addition. By choice of $n$ this must mean $r = 0$.
Therefore all elements of $H$ are of the form $n \cdot q$, for some value of $q$, which is what was required. □
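
The lemma can be explored numerically. The following is a minimal sketch, not part of the original text: the helper names `subgroup_generator` and `is_subgroup` are our own. It assumes the standard fact that the subgroup of Z+ generated by finitely many integers is gZ, where g is their greatest common divisor, and it uses the criterion stated above that nZ < mZ if and only if m divides n.

```python
from functools import reduce
from math import gcd


def subgroup_generator(elements):
    """Return the non-negative n such that the subgroup of (Z, +)
    generated by `elements` equals nZ (n is the gcd of the elements)."""
    return reduce(gcd, (abs(x) for x in elements), 0)


def is_subgroup(n, m):
    """Decide whether nZ is contained in mZ, i.e. whether m divides n."""
    if m == 0:
        return n == 0
    return n % m == 0


# One of the chains from the text: 18Z < 6Z < 2Z < Z.
chain = [18, 6, 2, 1]
chain_ok = all(is_subgroup(a, b) for a, b in zip(chain, chain[1:]))
```

Here `subgroup_generator([18, 12])` evaluates to 6, reflecting that 18Z and 12Z together generate 6Z.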

So every non-zero subgroup of $\mathbb{Z}^+$ is an infinite cyclic group. This last lemma combined with the earlier
subgroup chains gives us a good definition of what a prime number is.

Definition 100.34 (Prime Number). A prime number is a (positive) generator of a non-trivial
subgroup $H$ of $\mathbb{Z}^+$, for which no subgroup of $\mathbb{Z}^+$ contains $H$ except $\mathbb{Z}^+$ and $H$ itself.

What  is  good  about  this  definition  is  that  we  have  not  referred  to  the  multiplicative  structure  of 
Z  to  define  the  primes.  Also  it  is  obvious  that  neither  zero  nor  one  is  a  prime  number.  You  should 
convince  yourself  that  this  definition  leads  to  the  usual  definition  of  primes  in  terms  of  divisibility. 
In  addition  the  above  definition  allows  one  to  generalize  the  notion  of  primality  to  other  settings; 
for  how  this  is  done  consult  any  standard  textbook  on  abstract  algebra. 

Normal  Subgroups  and  Cosets:  A  normal  subgroup  is  particularly  important  in  the  theory  of 
groups.  The  name  should  not  be  thought  of  as  meaning  that  these  are  the  subgroups  that  normally 
arise;  the  name  is  a  historic  accident.  To  define  a  normal  subgroup  we  first  need  to  define  what  is 
meant  by  conjugate  elements. 

Definition 100.35 (Conjugate). Two elements $x, y$ of a group $G$ are said to be conjugate if there
is an element $g \in G$ such that $x = g^{-1} \cdot y \cdot g$.

It is obvious that two conjugate elements have the same order. As an exercise you should show
that conjugacy is an equivalence relation on the elements of a group. If $N$ is a
subgroup of $G$ we define, for any $g \in G$,

\[ g^{-1} N g = \{ g^{-1} \cdot x \cdot g : x \in N \}, \]

which is another subgroup of $G$, called a conjugate of the subgroup $N$.

Definition 100.36 (Normal Subgroup). A subgroup $N < G$ is said to be a normal subgroup if
$g^{-1} N g \subseteq N$ for all $g \in G$. If this is the case then we write $N \lhd G$.

For any group $G$ we have $G \lhd G$ and $\{e\} \lhd G$, and if $G$ is an abelian group then every subgroup of
$G$ is normal. The importance of normal subgroups comes from the fact that these are the subgroups
by which we can factor out. This is related to the cosets of a subgroup, which we now go on to
introduce.

Definition 100.37 (Cosets). Let $G$ be a group and $H < G$ ($H$ is not necessarily normal). Fix an
element $g \in G$, then we define the left coset of $H$ with respect to $g$ to be the set

\[ gH = \{ g \cdot h : h \in H \}. \]

Similarly we define the right coset of $H$ with respect to $g$ to be the set

\[ Hg = \{ h \cdot g : h \in H \}. \]

Let $H$ denote a subgroup of $G$ then one can show that the set of all left (or right) cosets of $H$ in
$G$ forms a partition of $G$, but we leave this to the reader. In addition if $a, b \in G$ then $aH = bH$ if
and only if $a \in bH$, which is also equivalent to $b \in aH$, a fact which we also leave to the reader to
show. Note that we can have two equal cosets $aH = bH$ without having $a = b$.

What these latter facts show is that if we define the relation $R_H$ on the group $G$ with respect
to the subgroup $H$ by

\[ (a, b) \in R_H \text{ if and only if } a = b \cdot h \text{ for some } h \in H, \]

then this relation is an equivalence relation. The equivalence classes are just the left cosets of $H$
in $G$.

The number of left cosets of a subgroup $H$ in $G$ is denoted by $(G : H)_L$, the number of right
cosets is denoted by $(G : H)_R$. We are now in a position to prove the most important theorem of
elementary group theory, namely Lagrange’s Theorem.

Theorem 100.38 (Lagrange’s Theorem). Let $H$ be a subgroup of a finite group $G$ then

\[ |G| = (G : H)_L \cdot |H| = (G : H)_R \cdot |H|. \]

Before  we  prove  this  result  we  state  some  obvious  important  corollaries. 

Corollary  100.39. 

• We have $(G : H)_L = (G : H)_R$; we denote this common number by $(G : H)$ and call it the
index of the subgroup $H$ in $G$.

• The order of a subgroup and the index of a subgroup both divide the order of the group.

• If $G$ is a group of prime order, then $G$ has only the subgroups $G$ and $\langle e \rangle$.

We  now  return  to  the  proof  of  Lagrange’s  Theorem. 


Proof. We form the following collection of distinct left cosets of $H$ in $G$, which we define
inductively. Put $g_1 = e$ and assume we are given $i$ cosets $g_1 H, \ldots, g_i H$. Now take an element $g_{i+1}$ not
lying in any of the left cosets $g_j H$ for $j \le i$. After a finite number of such steps we have exhausted
the elements of the group $G$. So we have a disjoint union of left cosets which covers the whole group:

\[ G = \bigcup_{1 \le i \le (G:H)_L} g_i H. \]

We also have for each $i, j$ that $|g_i H| = |g_j H|$; this follows from the fact that the map

\[ H \longrightarrow gH, \qquad h \longmapsto g \cdot h \]

is a bijective map on sets. Hence

\[ |G| = \sum_{1 \le i \le (G:H)_L} |g_i H| = (G : H)_L \cdot |H|. \]

The other equality follows using the same argument. □


We  can  also  deduce  from  the  corollaries  the  following. 

Lemma  100.40.  If  G  is  a  group  of  prime  order  then  it  is  cyclic. 

Proof. If $g \in G$ is not the identity then $\langle g \rangle$ is a subgroup of $G$ of order $\ge 2$. But then it must
have order $|G|$ and so $G$ is cyclic. □

We  can  use  Lagrange’s  Theorem  to  write  down  the  subgroups  of  some  small  groups.  For 
example,  consider  the  group  S3:  this  has  order  6  so  by  Lagrange’s  Theorem  its  subgroups  must 
have  order  1,  2,  3  or  6.  It  is  easy  to  see  that  the  only  subgroups  are  therefore: 

• One subgroup of order 1; namely $\langle (1) \rangle$,

• Three subgroups of order 2; namely $\langle (1,2) \rangle$, $\langle (1,3) \rangle$ and $\langle (2,3) \rangle$,

• One subgroup of order 3; namely $\langle (1,2,3) \rangle$,

• One subgroup of order 6, which is $S_3$ obviously.
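
Lagrange’s Theorem and the coset partition can be checked by brute force in $S_3$. The following is a small sketch under our own conventions, not the text's: permutations are stored as tuples acting on {0, 1, 2}, and the names `compose`, `left_cosets` and `right_cosets` are purely illustrative.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'apply q first, then p' on {0, 1, 2}."""
    return tuple(p[q[i]] for i in range(len(p)))


G = list(permutations(range(3)))        # the six elements of S_3
H = [(0, 1, 2), (1, 0, 2)]              # subgroup generated by one transposition
N = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # subgroup generated by a 3-cycle


def left_cosets(group, subgroup):
    """Set of all left cosets gH, each stored as a frozenset."""
    return {frozenset(compose(g, h) for h in subgroup) for g in group}


def right_cosets(group, subgroup):
    """Set of all right cosets Hg, each stored as a frozenset."""
    return {frozenset(compose(h, g) for h in subgroup) for g in group}


L = left_cosets(G, H)
R = right_cosets(G, H)

# Lagrange: |G| = (G:H)_L * |H| = (G:H)_R * |H|.
lagrange_holds = len(G) == len(L) * len(H) == len(R) * len(H)
```

For the order-two subgroup `H` the left and right cosets give different partitions of the group, whereas for the order-three subgroup `N` they coincide; this is the distinction that the notion of a normal subgroup captures.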


Factor  or  Quotient  Groups:  We  let  G  be  a  group  with  a  normal  subgroup  N.  The  following 
elementary  lemma,  whose  proof  we  again  leave  to  the  reader,  gives  us  our  justification  for  looking 
at  normal  subgroups. 

Lemma 100.41. Let $H < G$; then the following are equivalent:

(1) $xH = Hx$ for all $x \in G$.

(2) $x^{-1} H x = H$ for all $x \in G$.

(3) $H \lhd G$.

(4) $x^{-1} \cdot h \cdot x \in H$ for all $x \in G$ and $h \in H$.

By $G/N$ we denote the set of left cosets of $N$; note that these are the same as the right cosets of
$N$. We note that two cosets, $g_1 N$ and $g_2 N$, are equal if and only if $g_1^{-1} g_2 \in N$.

We wish to turn $G/N$ into a group, the so-called factor group or quotient group. Let $g_1 N$ and
$g_2 N$ denote any two elements of $G/N$, then we define the product of their left cosets to be $(g_1 g_2) N$.

We first need to show that this is a well-defined operation, i.e. if we replace $g_1$ by $g_1'$ and $g_2$
by $g_2'$ with $g_1^{-1} \cdot g_1' = n_1 \in N$ and $g_2^{-1} \cdot g_2' = n_2 \in N$ then our product still gives the same coset. In
other words we wish to show

\[ (g_1 \cdot g_2) N = (g_1' \cdot g_2') N. \]

Now let $x \in (g_1 \cdot g_2) N$, then $x = g_1 \cdot g_2 \cdot n$ for some $n \in N$. Then $x = g_1' \cdot n_1^{-1} \cdot g_2' \cdot n_2^{-1} \cdot n$. But as
$N$ is normal (left cosets = right cosets) we have $n_1^{-1} \cdot g_2' = g_2' \cdot n_3$ for some $n_3 \in N$. Hence

\[ x = g_1' \cdot g_2' \cdot n_3 \cdot n_2^{-1} \cdot n \in (g_1' \cdot g_2') N. \]

This proves the first inclusion; the other follows similarly. We conclude that our operation on $G/N$
is well defined. One can also show that if $N$ is an arbitrary subgroup of $G$ and we define the
operation on the cosets as above, then this is a well-defined operation only if $N$ is a normal subgroup
of $G$.

So  we  have  a  well-defined  operation  on  G/N ,  we  now  need  to  show  that  this  operation  satisfies 
the  axioms  of  a  group: 

• As an identity we take $eN = N$, since for all $g \in G$ we have

\[ eN \cdot gN = (e \cdot g) N = gN. \]

• As an inverse of $gN$ we take $g^{-1} N$ as

\[ gN \cdot g^{-1} N = (g \cdot g^{-1}) N = eN = N. \]

• Associativity follows from

\[ (g_1 N) \cdot (g_2 N \cdot g_3 N) = g_1 N \cdot ((g_2 \cdot g_3) N) = (g_1 \cdot (g_2 \cdot g_3)) N \]
\[ = ((g_1 \cdot g_2) \cdot g_3) N = ((g_1 \cdot g_2) N) \cdot g_3 N = (g_1 N \cdot g_2 N) \cdot (g_3 N). \]

We  now  present  some  examples. 

(1)  Let  G  be  an  arbitrary  finite  group  of  order  greater  than  one;  let  H  be  a  subgroup  of  G. 
Then  H  =  G  and  H  =  {e}  are  always  normal  subgroups  of  G. 

(2) If $H = G$ then there is only one coset and so we have $G/G = \{G\}$ is a group of order one.

(3) If $H = \{e\}$ then the cosets of $H$ are the one-element subsets of $G$. That is $G/\{e\} = \{\{g\} : g \in G\}$.

(4) Put $G = S_3$ and $N = \{(1), (1,2,3), (1,3,2)\}$, then $N$ is a normal subgroup of $G$. The
cosets of $N$ in $G$ are $N$ and $(1,2)N$ with

\[ ((1,2)N)^2 = (1,2)^2 N = (1)N = N. \]

Hence $S_3/\langle (1,2,3) \rangle$ is a cyclic group of order 2.

(5) If $G$ is abelian then every subgroup $H$ of $G$ is normal, so one can always form the quotient
group $G/H$.

(6) Since $(\mathbb{Z}, +)$ is abelian we have that $m\mathbb{Z}$ is always a normal subgroup. Forming the quotient
group $\mathbb{Z}/m\mathbb{Z}$ we obtain the group of integers modulo $m$ under addition.
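
The role of normality in making the coset product well defined can be tested exhaustively in $S_3$. The sketch below is illustrative only: it represents permutations as tuples on {0, 1, 2}, and the function names are our own. For every pair of cosets it checks that the coset of a product does not depend on which representatives are chosen.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(p)))


def coset(g, subgroup):
    """The left coset g * subgroup, as a frozenset."""
    return frozenset(compose(g, h) for h in subgroup)


def product_is_well_defined(group, subgroup):
    """Check that (g1 N)(g2 N) := (g1 g2) N is independent of representatives."""
    for g1 in group:
        for g2 in group:
            target = coset(compose(g1, g2), subgroup)
            for a in coset(g1, subgroup):        # other representatives of g1 N
                for b in coset(g2, subgroup):    # other representatives of g2 N
                    if coset(compose(a, b), subgroup) != target:
                        return False
    return True


S3 = list(permutations(range(3)))
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # normal subgroup generated by a 3-cycle
H = [(0, 1, 2), (1, 0, 2)]               # non-normal subgroup of order two
```

With these definitions `product_is_well_defined(S3, A3)` holds while `product_is_well_defined(S3, H)` fails, in line with the remark that the operation is well defined only for normal subgroups.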


Homomorphisms: Let $G_1$ and $G_2$ be two groups; we wish to look at the functions from $G_1$ to
$G_2$. Obviously we could look at all such functions; however, by doing this we would lose all the
structure that the group laws give us. We restrict ourselves to maps which preserve these group
laws.


Definition 100.42 (Homomorphism). A homomorphism from a group $G_1$ to a group $G_2$ is a
function $f$ with domain $G_1$ and codomain $G_2$ such that for all $x, y \in G_1$ we have

\[ f(x \cdot y) = f(x) \cdot f(y). \]

Note that multiplication on the left is with the operation of the group $G_1$ whilst the multiplication
on the right is with respect to the operation of $G_2$. As examples we have

(1) The identity map $\mathrm{id}_G : G \to G$, where $\mathrm{id}_G(g) = g$, is a group homomorphism.

(2) Consider the function $\mathbb{R}^+ \to \mathbb{R}^*$ given by $f(x) = e^x$. This is a homomorphism as for all
$x, y \in \mathbb{R}$ we have

\[ e^{x+y} = e^x \cdot e^y. \]

(3) Consider the map from $\mathbb{C}^*$ to $\mathbb{R}^*$ given by $f(z) = |z|$. This is also a homomorphism.

(4) Consider the map from $GL_n(\mathbb{C})$ to $\mathbb{C}^*$ given by $f(A) = \det(A)$; this is a group
homomorphism as $\det(A \cdot B) = \det(A) \cdot \det(B)$ for any two elements of $GL_n(\mathbb{C})$.

Two  elementary  properties  of  homomorphisms  are  summarized  in  the  following  lemma. 

Lemma 100.43. Let $f : G_1 \to G_2$ be a homomorphism of groups, then

(1) $f(e_1) = e_2$.

(2) For all $x \in G_1$ we have $f(x^{-1}) = (f(x))^{-1}$.


Proof. For the first result we have $e_2 \cdot f(x) = f(x) = f(e_1 \cdot x) = f(e_1) \cdot f(x)$, and then from
Lemma 100.25 we have $e_2 = f(e_1)$ as required. For the second result we have

\[ f(x^{-1}) \cdot f(x) = f(x^{-1} \cdot x) = f(e_1) = e_2, \]

so the result follows by definition. □


For any homomorphism $f$ from $G_1$ to $G_2$ there are two special subgroups associated with $f$.

Definition 100.44 (Kernel and Image).

• The kernel of $f$ is the set

\[ \mathrm{Ker}\, f = \{ x \in G_1 : f(x) = e_2 \}. \]

• The image of $f$ is the set

\[ \mathrm{Im}\, f = \{ y \in G_2 : y = f(x), \; x \in G_1 \}. \]

Lemma 100.45. $\mathrm{Ker}\, f$ is a normal subgroup of $G_1$.

Proof. We first show that it is a subgroup. It is certainly non-empty as $e_1 \in \mathrm{Ker}\, f$ since $f(e_1) = e_2$.
Now if $x \in \mathrm{Ker}\, f$ then $f(x^{-1}) = f(x)^{-1} = e_2^{-1} = e_2$, hence $x^{-1} \in \mathrm{Ker}\, f$. Hence to show that $\mathrm{Ker}\, f$
is a subgroup we only have to show that for all $x, y \in \mathrm{Ker}\, f$ we have $x \cdot y^{-1} \in \mathrm{Ker}\, f$. But this is
easy as if $x, y \in \mathrm{Ker}\, f$ then we have

\[ f(x \cdot y^{-1}) = f(x) \cdot f(y^{-1}) = e_2 \cdot e_2 = e_2, \]

and we are done.

We now show that $\mathrm{Ker}\, f$ is in fact a normal subgroup of $G_1$. We need to show that if $x \in \mathrm{Ker}\, f$
then $g^{-1} \cdot x \cdot g \in \mathrm{Ker}\, f$ for all $g \in G_1$. So let $x \in \mathrm{Ker}\, f$ and let $g \in G_1$, then we have

\[ f(g^{-1} \cdot x \cdot g) = f(g^{-1}) \cdot f(x) \cdot f(g) = f(g)^{-1} \cdot e_2 \cdot f(g) = f(g)^{-1} \cdot f(g) = e_2, \]

so we are done. □


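Lemma 100.46. $\mathrm{Im}\, f$ is a subgroup of $G_2$.

Proof. $\mathrm{Im}\, f$ is certainly non-empty as $f(e_1) = e_2$. Now suppose $y \in \mathrm{Im}\, f$ so there is an $x \in G_1$
such that $f(x) = y$, then $y^{-1} = f(x)^{-1} = f(x^{-1})$ and $x^{-1} \in G_1$ so $y^{-1} \in \mathrm{Im}\, f$. Now suppose
$y_1, y_2 \in \mathrm{Im}\, f$, hence for some $x_1, x_2 \in G_1$ with $f(x_1) = y_1$ and $f(x_2) = y_2$ we have

\[ y_1 \cdot y_2^{-1} = f(x_1) \cdot f(x_2^{-1}) = f(x_1 \cdot x_2^{-1}). \]

Hence $\mathrm{Im}\, f < G_2$. □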


It is clear that $\mathrm{Im}\, f$ in some sense measures whether the homomorphism $f$ is surjective, as $f$ is
surjective if and only if $\mathrm{Im}\, f = G_2$. Actually the set $G_2/\mathrm{Im}\, f$ is a better measure of the surjectivity
of the function. On the other hand, $\mathrm{Ker}\, f$ measures how far from injective $f$ is, due to the following
result.

Lemma 100.47. A homomorphism, $f$, is injective if and only if $\mathrm{Ker}\, f = \{e_1\}$.

Proof. Assume $f$ is injective, then we know that if $f(x) = e_2 = f(e_1)$ then $x = e_1$ and so
$\mathrm{Ker}\, f = \{e_1\}$. Now assume that $\mathrm{Ker}\, f = \{e_1\}$ and let $x, y \in G_1$ be such that $f(x) = f(y)$. Then

\[ f(x \cdot y^{-1}) = f(x) \cdot f(y^{-1}) = f(x) \cdot f(y)^{-1} = f(y) \cdot f(y)^{-1} = e_2. \]

So $x \cdot y^{-1} \in \mathrm{Ker}\, f$, but then $x \cdot y^{-1} = e_1$ and so $x = y$. So $f$ is injective. □


Isomorphisms:  Bijective  homomorphisms  allow  us  to  categorize  groups  more  effectively,  as  the 
following  definition  elaborates. 

Definition 100.48 (Isomorphism). A homomorphism $f$ is said to be an isomorphism if it is
bijective. Two groups are said to be isomorphic if there is an isomorphism between them, in which
case we write $G_1 \cong G_2$.

Note that this means that isomorphic groups have the same number of elements. Indeed for all
intents and purposes one may as well assume that isomorphic groups are equal, since they look the
same up to relabelling of elements. Isomorphisms satisfy the following properties.

• If $f : G_1 \to G_2$ and $g : G_2 \to G_3$ are isomorphisms then $g \circ f$ is also an isomorphism, i.e.
isomorphisms are transitive.

• If $f : G_1 \to G_2$ is an isomorphism then so is $f^{-1} : G_2 \to G_1$, i.e. isomorphisms are
symmetric.

• The identity map $\mathrm{id} : G \to G$ given by $\mathrm{id}(x) = x$ is an isomorphism, i.e. isomorphisms are
reflexive.

From  this  we  see  that  the  relation  “is  isomorphic  to”  is  an  equivalence  relation  on  the  class  of  all 
groups.  This  justifies  our  notion  of  isomorphic  being  like  equal. 

Let $G_1, G_2$ be two groups, then we define the product group $G_1 \times G_2$ to be the set $G_1 \times G_2$
of ordered pairs $(g_1, g_2)$ with $g_1 \in G_1$ and $g_2 \in G_2$. The group operation on $G_1 \times G_2$ is given
componentwise:

\[ (g_1, g_2) \circ (g_1', g_2') = (g_1 \circ g_1', \; g_2 \circ g_2'). \]

The first $\circ$ refers to the group $G_1 \times G_2$, the second to the group $G_1$ and the third to the group $G_2$.
Some well-known groups can actually be represented as product groups. For example, consider the
map

\[ \mathbb{C}^+ \longrightarrow \mathbb{R}^+ \times \mathbb{R}^+, \qquad z \longmapsto (\mathrm{Re}(z), \mathrm{Im}(z)). \]

This map is obviously a bijective homomorphism, hence we have $\mathbb{C}^+ \cong \mathbb{R}^+ \times \mathbb{R}^+$.

We  now  come  to  a  crucial  theorem  which  says  that  the  concept  of  a  quotient  group  is  virtually 
equivalent  to  the  concept  of  a  homomorphic  image. 

Theorem 100.49 (First Isomorphism Theorem for Groups). Let $f$ be a homomorphism from a
group $G_1$ to a group $G_2$. Then

\[ G_1/\mathrm{Ker}\, f \cong \mathrm{Im}\, f. \]

The proof of this result can be found in any introductory text on abstract algebra. Note that
$G_1/\mathrm{Ker}\, f$ makes sense as $\mathrm{Ker}\, f$ is a normal subgroup of $G_1$.
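
A small finite instance makes the theorem concrete. The example below is our own choice rather than one from the text: we take the additive group of integers modulo 12 and the map sending x to 3x modulo 12, compute the kernel and image directly, and check that sending a coset of the kernel to the common value of the map on that coset is a well-defined bijection onto the image.

```python
M = 12


def f(x):
    """Homomorphism of (Z/12Z, +) to itself given by x -> 3x (mod 12)."""
    return (3 * x) % M


G = range(M)
kernel = {x for x in G if f(x) == 0}
image = {f(x) for x in G}

# The cosets x + Ker f, and the values f takes on each coset.
cosets = {frozenset((x + k) % M for k in kernel) for x in G}
induced = {c: {f(x) for x in c} for c in cosets}

# f is constant on every coset, so the induced map is well defined ...
well_defined = all(len(values) == 1 for values in induced.values())
# ... and distinct cosets are sent to distinct elements of the image.
bijective = len({next(iter(v)) for v in induced.values()}) == len(cosets) == len(image)
```

Here the kernel is {0, 4, 8}, the image is {0, 3, 6, 9}, and there are four cosets, matching the four elements of the image.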

A.7. Rings

A ring is an additive abelian group with an extra operation, usually denoted by multiplication,
such  that  the  multiplication  operation  is  associative  and  has  an  identity  element.  The  addition  and 
multiplication  operations  are  linked  via  the  distributive  law, 

a  •  (b  +  c)  =  a  •  b  +  a  •  c. 

If  the  multiplication  operation  is  commutative  then  we  say  we  have  a  commutative  ring.  The 
following  are  examples  of  rings. 

•  Integers  under  addition  and  multiplication  of  integers. 

•  Polynomials  with  coefficients  in  Z,  denoted  Z[X],  under  polynomial  addition  and  multi¬ 
plication. 

•  Integers  modulo  a  number  m,  denoted  Z/mZ,  under  addition  and  multiplication  modulo 
m. 

Although  one  can  consider  subrings  they  turn  out  to  be  not  so  interesting.  Of  more  interest  are 
the ideals of the ring; these are additive subgroups $I < R$ such that

\[ i \in I \text{ and } r \in R \text{ implies } i \cdot r \in I. \]

Examples  of  ideals  in  a  ring  are  the  principal  ideals  which  are  those  additive  subgroups  generated 
by  a  single  ring  element.  For  example  if  R  =  Z  then  the  principal  ideals  are  the  ideals  mZ,  for  each 
integer  m. 

Just  as  with  normal  subgroups  and  groups,  where  we  formed  the  quotient  group,  with  ideals 
and  rings  we  can  form  the  quotient  ring.  If  we  take  R  =  Z  and  I  =  mZ  for  some  integer  m  then 
the  quotient  ring  is  the  ring  Z/mZ  of  integers  modulo  m  under  addition  and  multiplication  modulo 
m. This leads us naturally to the Chinese Remainder Theorem.

Theorem 100.50 (CRT). Let $m = p_1^{z_1} \cdots p_t^{z_t}$ be the prime factorization of $m$, then the following
map is a ring isomorphism

\[ f : \mathbb{Z}/m\mathbb{Z} \longrightarrow \mathbb{Z}/p_1^{z_1}\mathbb{Z} \times \cdots \times \mathbb{Z}/p_t^{z_t}\mathbb{Z}, \]
\[ x \longmapsto (x \pmod{p_1^{z_1}}, \ldots, x \pmod{p_t^{z_t}}). \]


Proof.  This  can  be  proved  by  induction  on  the  number  of  prime  factors  of  m.  We  leave  the  details 
to  the  interested  reader.  □ 
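
The isomorphism of the theorem, together with its inverse, can be sketched in a few lines. This is an illustration under stated assumptions, not a construction given in the text: the function names are ours, the moduli are assumed to be pairwise coprime (as prime powers of distinct primes are), and the inverse uses the standard reconstruction formula, relying on Python's three-argument `pow` with exponent -1 (available from Python 3.8) to compute modular inverses.

```python
from math import prod


def crt_forward(x, moduli):
    """Map x in Z/mZ to its tuple of residues modulo each prime power."""
    return tuple(x % q for q in moduli)


def crt_inverse(residues, moduli):
    """Recover x modulo m = prod(moduli) from its residues.

    Assumes the moduli are pairwise coprime.
    """
    m = prod(moduli)
    x = 0
    for r, q in zip(residues, moduli):
        n = m // q                      # product of the other moduli
        x += r * n * pow(n, -1, q)      # n * n^{-1} is 1 mod q and 0 mod the others
    return x % m


moduli = [8, 9, 5]                      # m = 360 = 2^3 * 3^2 * 5
m = prod(moduli)
```

For example `crt_forward(17, moduli)` gives `(1, 8, 2)`, and `crt_inverse((1, 8, 2), moduli)` returns 17 again.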


We shall now return to the Euler $\phi$ function mentioned earlier. Remember $\phi(n)$ denotes the order
of the group $(\mathbb{Z}/n\mathbb{Z})^*$. We would like to be able to calculate this value easily.

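Lemma 100.51. Let $m = p_1^{z_1} \cdots p_t^{z_t}$ be the prime factorization of $m$. Then we have

\[ \phi(m) = \phi(p_1^{z_1}) \cdots \phi(p_t^{z_t}). \]

Proof. This follows from the Chinese Remainder Theorem, as the ring isomorphism

\[ \mathbb{Z}/m\mathbb{Z} \cong \mathbb{Z}/p_1^{z_1}\mathbb{Z} \times \cdots \times \mathbb{Z}/p_t^{z_t}\mathbb{Z} \]

induces a group isomorphism

\[ (\mathbb{Z}/m\mathbb{Z})^* \cong (\mathbb{Z}/p_1^{z_1}\mathbb{Z})^* \times \cdots \times (\mathbb{Z}/p_t^{z_t}\mathbb{Z})^*. \]

□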


To compute the Euler $\phi$ function all we now require is the following.

Lemma 100.52. Let $p$ be a prime number, then $\phi(p^e) = p^{e-1} \cdot (p - 1)$.

Proof. There are $p^e - 1$ elements of $\mathbb{Z}$ satisfying $1 \le k < p^e$; of these we must eliminate those of
the form $k = r \cdot p$ for some $r$. But $1 \le r \cdot p < p^e$ implies $1 \le r < p^{e-1}$, hence there are $p^{e-1} - 1$
possible values of $r$. So we obtain

\[ \phi(p^e) = (p^e - 1) - (p^{e-1} - 1) \]

from which the result follows. □
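
The two lemmas together give a way to compute the Euler function from a factorization. The following sketch uses naive trial division, which is only sensible for small inputs, and compares the result with a direct count of the integers coprime to m; the function names are ours.

```python
from math import gcd


def factorize(m):
    """Trial-division factorization of m >= 1, returned as {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors


def phi(m):
    """Euler phi via phi(m) = product of p^(e-1) * (p - 1) over prime powers p^e."""
    result = 1
    for p, e in factorize(m).items():
        result *= p ** (e - 1) * (p - 1)
    return result


def phi_by_counting(m):
    """Order of (Z/mZ)^*, counted directly from the definition."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)
```

For instance `phi(360)` is computed as 4 · 6 · 4 = 96 from the factorization 360 = 2^3 · 3^2 · 5.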

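A proper ideal $I$ of a ring is called prime if $x \cdot y \in I$ implies either $x \in I$ or $y \in I$. Notice that the
non-zero ideals $I = m\mathbb{Z}$ of the ring $\mathbb{Z}$ are prime if and only if $m$ is plus or minus a prime number. These prime
ideals are special, as if we take the quotient of $\mathbb{Z}$ by a non-zero prime ideal then we obtain a field
(for a general commutative ring the quotient by a prime ideal is only guaranteed to have no zero divisors). Hence,
$\mathbb{Z}/p\mathbb{Z}$ is a field. This brings us naturally to the subject of fields.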

A.8. Fields

A field is essentially two abelian groups stuck together using the distributive law.

Definition  100.53  (Field).  A  field  is  an  additive  abelian  group  F,  such  that  F  \  {0}  also  forms 
an  abelian  group  with  respect  to  another  operation  (which  is  usually  written  multiplicatively).  The 
two  operations,  addition  and  multiplication,  are  linked  via  the  distributive  law: 

a  •  (b  +  c)  =  a  •  b  +  a  •  c  =  (b  +  c)  •  a. 

Many fields that one encounters have infinitely many elements. Every field either contains $\mathbb{Q}$ as a
subfield, in which case we say it has characteristic zero, or it contains $\mathbb{F}_p$ as a subfield, in which case
we say it has characteristic $p$. The only fields with finitely many elements have $p^r$ elements, where
$p$ is a prime. We denote such fields by $\mathbb{F}_{p^r}$; for each value of $r$ there is only one such field up to
isomorphism. Such finite fields are often called Galois fields.

Let $F$ be a field. We denote by $F[X]$ the ring of polynomials in a single variable $X$ with
coefficients in the field $F$. The set $F(X)$ of rational functions in $X$ is the set of functions of the
form

\[ f(X)/g(X), \]

where $f(X), g(X) \in F[X]$ and $g(X)$ is not the zero polynomial. The set $F(X)$ is a field with
respect to the obvious addition and multiplication. One should note the difference in the notation
of the brackets, $F[X]$ and $F(X)$.

Let $f$ be a polynomial of degree $n$ with coefficients in $\mathbb{F}_p$ which is irreducible. Let $\theta$ denote a
root of $f$. Consider the set

\[ \mathbb{F}_p(\theta) = \{ a_0 + a_1 \cdot \theta + \cdots + a_{n-1} \cdot \theta^{n-1} : a_i \in \mathbb{F}_p \}. \]

Given two elements of $\mathbb{F}_p(\theta)$ one adds them componentwise and multiplies them as polynomials in
$\theta$ but then one takes the remainder of the result on division by $f(\theta)$. The set $\mathbb{F}_p(\theta)$ is a field; there
are field-theoretic isomorphisms

\[ \mathbb{F}_{p^n} \cong \mathbb{F}_p(\theta) \cong \mathbb{F}_p[X]/(f), \]

where $(f)$ represents the ideal $\{ f \cdot g : g \in \mathbb{F}_p[X] \}$.

Finite Field Example 1: To be more concrete let us look at the specific example given by
choosing a value of $p \equiv 3 \pmod 4$ and $f(X) = X^2 + 1$. Now since $p \equiv 3 \pmod 4$ the polynomial
$f$ is irreducible in $\mathbb{F}_p[X]$ and so the quotient $\mathbb{F}_p[X]/(f)$ forms a field, which is isomorphic to $\mathbb{F}_{p^2}$.
Let $i$ denote a root of the polynomial $X^2 + 1$. The field $\mathbb{F}_{p^2} = \mathbb{F}_p(i)$ consists of numbers of the form
$a + b \cdot i$, where $a$ and $b$ are integers modulo $p$. We add such numbers as

\[ (a + b \cdot i) + (c + d \cdot i) = (a + c) + (b + d) \cdot i. \]

We multiply such numbers as

\[ (a + b \cdot i) \cdot (c + d \cdot i) = (a \cdot c + (a \cdot d + b \cdot c) \cdot i + b \cdot d \cdot i^2) = (a \cdot c - b \cdot d) + (a \cdot d + b \cdot c) \cdot i. \]
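
These two rules translate directly into code. The sketch below fixes p = 7 as one admissible choice (any prime congruent to 3 modulo 4 would do) and stores a + b·i as the pair (a, b). The inverse is not described in the text; it is our addition and uses the usual conjugate formula (a − b·i)/(a² + b²), where a² + b² is non-zero modulo p for non-zero elements precisely because X² + 1 has no root modulo p.

```python
P = 7  # an assumed example prime with P % 4 == 3


def add(x, y):
    """(a + b*i) + (c + d*i) = (a + c) + (b + d)*i, coefficients mod P."""
    (a, b), (c, d) = x, y
    return ((a + c) % P, (b + d) % P)


def mul(x, y):
    """(a + b*i) * (c + d*i) = (a*c - b*d) + (a*d + b*c)*i, using i^2 = -1."""
    (a, b), (c, d) = x, y
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def inv(x):
    """Inverse of a non-zero element via the conjugate: (a - b*i) / (a^2 + b^2)."""
    a, b = x
    norm_inv = pow(a * a + b * b, -1, P)   # modular inverse, Python 3.8+
    return ((a * norm_inv) % P, (-b * norm_inv) % P)


I = (0, 1)      # the element i
ONE = (1, 0)
```

As a check, `mul(I, I)` gives `(6, 0)`, which is −1 modulo 7, and every one of the 48 non-zero pairs has an inverse.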

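Finite Field Example 2: Let $\theta$ denote a root of the polynomial $x^3 + 2$, then an element of
$\mathbb{F}_{7^3} = \mathbb{F}_7(\theta)$ can be represented by

\[ a + b \cdot \theta + c \cdot \theta^2, \]

where $a, b, c \in \mathbb{F}_7$. Multiplication of two such elements gives

\[ (a + b \cdot \theta + c \cdot \theta^2) \cdot (a' + b' \cdot \theta + c' \cdot \theta^2) = a \cdot a' + \theta \cdot (a' \cdot b + b' \cdot a) + \theta^2 \cdot (a \cdot c' + b \cdot b' + c \cdot a') \]
\[ \qquad + \theta^3 \cdot (b \cdot c' + c \cdot b') + c \cdot c' \cdot \theta^4 \]
\[ = (a \cdot a' - 2 \cdot b \cdot c' - 2 \cdot c \cdot b') + \theta \cdot (a' \cdot b + b' \cdot a - 2 \cdot c \cdot c') \]
\[ \qquad + \theta^2 \cdot (a \cdot c' + b \cdot b' + c \cdot a'). \]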
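
The reduced multiplication formula can be coded as it stands. In this sketch an element a + b·θ + c·θ² is stored as the triple (a, b, c) with entries modulo 7; the helper `power` is our own addition and simply performs square-and-multiply so that the field structure can be tested.

```python
P = 7


def mul(x, y):
    """Multiply two elements of F_7(theta), where theta^3 = -2."""
    a, b, c = x
    a2, b2, c2 = y
    return (
        (a * a2 - 2 * b * c2 - 2 * c * b2) % P,    # constant term
        (a * b2 + b * a2 - 2 * c * c2) % P,        # coefficient of theta
        (a * c2 + b * b2 + c * a2) % P,            # coefficient of theta^2
    )


def power(x, e):
    """Compute x^e for e >= 0 by repeated squaring."""
    result = (1, 0, 0)
    while e > 0:
        if e & 1:
            result = mul(result, x)
        x = mul(x, x)
        e >>= 1
    return result


THETA = (0, 1, 0)
```

Here `power(THETA, 3)` is `(5, 0, 0)`, that is −2 modulo 7, as the defining relation requires. Since the non-zero elements form a group of order 7³ − 1 = 342, raising any of them to the power 342 gives the identity.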

A.9. Vector Spaces

Definition 100.54 (Vector Space). Given a field $K$, a vector space (or a $K$-vector space) $V$
is an abelian group (also denoted $V$) and an external operation $\cdot : K \times V \to V$ (called scalar
multiplication) which satisfies the following axioms: For all $\lambda, \mu \in K$ and all $x, y \in V$ we have

(1) $\lambda \cdot (\mu \cdot x) = (\lambda \cdot \mu) \cdot x$.

(2) $(\lambda + \mu) \cdot x = \lambda \cdot x + \mu \cdot x$.

(3) $1_K \cdot x = x$.

(4) $\lambda \cdot (x + y) = \lambda \cdot x + \lambda \cdot y$.

where $1_K$ denotes the multiplicative identity of $K$.

One  often  calls  the  elements  of  V  the  vectors  and  the  elements  of  K  the  scalars.  Note  that  we  have 
not  defined  how  to  (or  whether  we  can)  multiply  or  divide  two  vectors.  With  a  general  vector  space 
we  are  not  interested  in  multiplying  or  dividing  vectors,  only  in  multiplying  them  with  scalars.  We 
shall start with some examples:

• For a given field $K$ and an integer $n \ge 1$, let $V = K^n = K \times \cdots \times K$ be the n-fold Cartesian
product. This is a vector space over $K$ with respect to the usual addition of vectors and
multiplication by scalars. The special case of $n = 1$ shows that any field is a vector space
over itself. When $K = \mathbb{R}$ and $n = 2$ we obtain the familiar system of geometric vectors
in the plane. When $n = 3$ and $K = \mathbb{R}$ we obtain 3-dimensional vectors. Hence you can
already see the power of vector spaces as they allow us to consider n-dimensional space in
a concrete way.

• Let $K$ be a field and consider the set of polynomials over $K$, namely $K[X]$. This is a
vector space with respect to addition of polynomials and multiplication by elements of $K$.

• Let $K$ be a field and $E$ any set at all. Define $V$ to be the set of functions $f : E \to K$.
Given $f, g \in V$ and $\lambda \in K$ one can define the sum $f + g$ and scalar product $\lambda \cdot f$ via

\[ (f + g)(x) = f(x) + g(x) \quad \text{and} \quad (\lambda \cdot f)(x) = \lambda \cdot f(x). \]

We leave the reader the simple task of checking that this is a vector space.

• The set of all continuous functions $f : \mathbb{R} \to \mathbb{R}$ is a vector space over $\mathbb{R}$. This follows from
the fact that if $f$ and $g$ are continuous then so are $f + g$ and $\lambda \cdot f$ for any $\lambda \in \mathbb{R}$. Similarly
the set of all differentiable functions $f : \mathbb{R} \to \mathbb{R}$ also forms a vector space.

Vector Sub-spaces: Let $V$ be a $K$-vector space and let $W$ be a subset of $V$. $W$ is said to be a
vector subspace (or just subspace) of $V$ if

(1) $W$ is a subgroup of $V$ with respect to addition.

(2) $W$ is closed under scalar multiplication.

By this last condition we mean $\lambda \cdot x \in W$ for all $x \in W$ and all $\lambda \in K$. What this means is that
a vector subspace is a subset of $V$ which is also a vector space with respect to the same addition
and multiplication laws as $V$. There are always two trivial subspaces of a space, namely $\{0\}$ and
$V$ itself. Here are some more examples:

• $V = K^n$ and $W = \{(\xi_1, \ldots, \xi_n) \in K^n : \xi_n = 0\}$.

• $V = K^n$ and $W = \{(\xi_1, \ldots, \xi_n) \in K^n : \xi_1 + \cdots + \xi_n = 0\}$.

• $V = K[X]$ and $W = \{f \in K[X] : f = 0 \text{ or } \deg f \le 10\}$.

• $\mathbb{C}$ is a natural vector space over $\mathbb{Q}$, and $\mathbb{R}$ is a vector subspace of $\mathbb{C}$.

• Let $V$ denote the set of all continuous functions from $\mathbb{R}$ to $\mathbb{R}$ and $W$ the set of all
differentiable functions from $\mathbb{R}$ to $\mathbb{R}$. Then $W$ is a vector subspace of $V$.

Properties of Elements of Vector Spaces: Before we go any further we need to define certain
properties which sets of elements of vector spaces can possess. For the following definitions let $V$
be a $K$-vector space and let $x_1, \ldots, x_n$ and $x$ denote elements of $V$.

Definition 100.55 (Linear Independence). We have the following definitions related to linear
independence of vectors.

• $x$ is said to be a linear combination of $x_1, \ldots, x_n$ if there exist scalars $\lambda_i \in K$ such that

\[ x = \lambda_1 \cdot x_1 + \cdots + \lambda_n \cdot x_n. \]

• The elements $x_1, \ldots, x_n$ are said to be linearly independent if the relation

\[ \lambda_1 \cdot x_1 + \cdots + \lambda_n \cdot x_n = 0 \]

implies that $\lambda_1 = \cdots = \lambda_n = 0$. If $x_1, \ldots, x_n$ are not linearly independent then they are
said to be linearly dependent.

• A subset $A$ of a vector space is linearly independent or free if whenever $x_1, \ldots, x_n$ are
finitely many elements of $A$, they are linearly independent.

• A subset $A$ of a vector space $V$ is said to span (or generate) $V$ if every element of $V$ is a
linear combination of finitely many elements from $A$.

• If there exists a finite set of vectors spanning $V$ then we say that $V$ is finite-dimensional.

We now give some examples of the last concept.

• The vector space $V = K^n$ is finite-dimensional, since if we let

\[ e_i = (0, \ldots, 0, 1, 0, \ldots, 0) \]

be the n-tuple with 1 in the $i$th place and 0 elsewhere, then $V$ is spanned by the vectors
$e_1, \ldots, e_n$. Note the analogy with the geometric plane.

• $\mathbb{C}$ is a finite-dimensional vector space over $\mathbb{R}$, and $\{1, i\}$ is a spanning set.

• $\mathbb{R}$ and $\mathbb{C}$ are not finite-dimensional vector spaces over $\mathbb{Q}$. This is obvious since $\mathbb{Q}$ has
countably many elements, so any finite-dimensional subspace over $\mathbb{Q}$ will also have
countably many elements. However it is a basic result in analysis that both $\mathbb{R}$ and $\mathbb{C}$ have
uncountably many elements.

Now  some  examples  about  linear  independence: 

• In the vector space $V = K^n$ the $n$ vectors $e_1, \ldots, e_n$ defined earlier are linearly
independent.

• In the vector space $\mathbb{R}^3$ the vectors $x_1 = (1, 2, 3)$, $x_2 = (-1, 0, 4)$ and $x_3 = (2, 5, -1)$ are
linearly independent.

• On the other hand, the vectors $y_1 = (2, 4, -3)$, $y_2 = (1, 1, 2)$ and $y_3 = (2, 8, -17)$ are
linearly dependent as we have $3 \cdot y_1 - 4 \cdot y_2 - y_3 = 0$.

• In the vector space (and ring) $K[X]$ over the field $K$ the infinite set of vectors
$\{1, X, X^2, X^3, \ldots\}$ is linearly independent.
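
The two numerical examples above can be checked mechanically. The following sketch decides linear independence of finitely many vectors with rational entries by Gaussian elimination over exact fractions; the function names are ours, and the method is a standard one rather than anything prescribed by the text.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of rational vectors, via Gaussian elimination."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    n_cols = len(rows[0]) if rows else 0
    found = 0                                   # number of pivots found so far
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue                            # no pivot in this column
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def linearly_independent(vectors):
    """Vectors are independent exactly when the rank equals their number."""
    return rank(vectors) == len(vectors)


xs = [(1, 2, 3), (-1, 0, 4), (2, 5, -1)]
ys = [(2, 4, -3), (1, 1, 2), (2, 8, -17)]

# The explicit dependency 3*y1 - 4*y2 - y3 = 0, checked coordinate by coordinate.
relation = tuple(3 * a - 4 * b - c for a, b, c in zip(*ys))
```

With these inputs `linearly_independent(xs)` is true, `linearly_independent(ys)` is false, and `relation` is the zero vector.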

Dimension  and  Bases: 

Definition  100.56  (Basis).  A  subset  A  of  a  vector  space  V  which  is  linearly  independent  and 
spans  the  whole  of  V  is  called  a  basis. 

Given a basis, each element in V can be written in a unique way as a linear combination of the
basis elements: for suppose $x_1, \ldots, x_n$ is a basis and we can write x as a linear combination
of the $x_i$ in two ways, i.e. $x = \lambda_1 \cdot x_1 + \cdots + \lambda_n \cdot x_n$ and
$x = \mu_1 \cdot x_1 + \cdots + \mu_n \cdot x_n$. Then we have

$0 = x - x = (\lambda_1 - \mu_1) \cdot x_1 + \cdots + (\lambda_n - \mu_n) \cdot x_n$

and as the $x_i$ are linearly independent we obtain $\lambda_i - \mu_i = 0$, i.e. $\lambda_i = \mu_i$, for every i.
We have the following examples.

•  The vectors $e_1, \ldots, e_n$ of $K^n$ introduced earlier form a basis of $K^n$. This basis is called
the standard basis of $K^n$.

•  The set $\{1, i\}$ is a basis of the vector space $\mathbb{C}$ over $\mathbb{R}$.

•  The infinite set $\{1, X, X^2, X^3, \ldots\}$ is a basis of the vector space $K[X]$.
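
Because the representation with respect to a basis is unique, the scalars can be computed by solving a linear system. The following is a minimal sketch over the rationals, assuming the basis vectors are given as tuples; the function name `coordinates` and the sample vector are illustrative choices, not part of the text.

```python
from fractions import Fraction


def coordinates(basis, x):
    """Return the scalars l_1, ..., l_n with x = l_1*basis[0] + ... + l_n*basis[n-1].

    Assumes `basis` is a basis of Q^n; raises ValueError otherwise.
    """
    n = len(basis)
    # Augmented matrix: column j holds the j-th basis vector, the last column holds x.
    m = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(x[i])]
         for i in range(n)]
    for col in range(n):
        pivot = next((i for i in range(col, n) if m[i][col] != 0), None)
        if pivot is None:
            raise ValueError("the given vectors do not form a basis")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        pv = m[col][col]
        m[col] = [a / pv for a in m[col]]
        # Eliminate this column from every other row (Gauss-Jordan).
        for i in range(n):
            if i != col and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[col])]
    return [m[i][n] for i in range(n)]


basis = [(1, 2, 3), (-1, 0, 4), (2, 5, -1)]
x = (3, 9, 9)  # equals 2*(1,2,3) + 1*(-1,0,4) + 1*(2,5,-1)
coords = coordinates(basis, x)
print(coords)  # [Fraction(2, 1), Fraction(1, 1), Fraction(1, 1)]
```

If the input vectors are linearly dependent, no pivot is found in some column and the function raises instead of returning a non-unique answer.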

By  way  of  terminology  we  call  the  vector  space  V  =  {0}  the  trivial  or  zero  vector  space.  All 
other  vector  spaces  are  called  non-zero.  To  make  the  statements  of  the  following  theorems  easier 
we shall say that the zero vector space has the empty set $\emptyset$ as its basis.

Theorem 100.57. Let V be a finite-dimensional vector space over a field K. Let C be a finite
subset of V which spans V and let A be a subset of C which is linearly independent. Then V has
a basis, B, such that $A \subseteq B \subseteq C$.

Proof.  We  can  assume  that  V  is  non-zero.  Consider  the  collection  of  all  subsets  of  C  which 
are  linearly  independent  and  contain  A.  Certainly  such  subsets  exist  since  A  is  itself  an  example. 
So  choose  one  such  subset  B  with  as  many  elements  as  possible.  By  construction  B  is  linearly 
independent.  We  now  show  that  B  spans  V. 

Since C spans V we only have to show that every element $x \in C$ is a linear combination of
elements of B. This is trivial when $x \in B$ so assume that $x \notin B$. Then $B' = B \cup \{x\}$ is a subset
of C larger than B, whence B' is linearly dependent, by choice of B. If $x_1, \ldots, x_r$ are the distinct
elements of B this means that there is a linear relation

$\lambda_1 \cdot x_1 + \cdots + \lambda_r \cdot x_r + \lambda \cdot x = 0$,

in which not all the scalars, $\lambda_i$, $\lambda$, are zero. In fact $\lambda \neq 0$, otherwise B would consist of linearly
dependent vectors. So we may rearrange to express x as a linear combination of elements of B,
namely $x = -\lambda^{-1} \cdot (\lambda_1 \cdot x_1 + \cdots + \lambda_r \cdot x_r)$, as $\lambda$ has an inverse in K. □
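
The proof is constructive in spirit. The sketch below, assuming vectors with rational entries, builds a basis B with A contained in B and B contained in C. Note one deliberate difference from the proof: rather than choosing a linearly independent subset with as many elements as possible, it grows B greedily, which yields a set that cannot be enlarged within C, and that is all the spanning argument uses. The function names are illustrative.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length vectors over Q (Gaussian elimination)."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def extend_to_basis(A, C):
    """Grow the independent set A, using only elements of the spanning set C."""
    B = list(A)
    for x in C:
        # Adding x keeps B independent exactly when the rank goes up by one.
        if x not in B and rank(B + [x]) == len(B) + 1:
            B.append(x)
    return B


C = [(1, 0, 0), (1, 1, 0), (2, 1, 0), (0, 0, 1), (1, 1, 1)]  # spans Q^3
A = [(1, 1, 0)]                                              # independent subset of C
B = extend_to_basis(A, C)
print(B)  # [(1, 1, 0), (1, 0, 0), (0, 0, 1)]
```

Calling `extend_to_basis([], C)` corresponds to taking A to be the empty set, which extracts a basis from any finite spanning set.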

Corollary 100.58. Every finite-dimensional vector space V has a basis.

Proof. We can assume that V is non-zero. Let C denote a finite spanning set of V and let $A = \emptyset$
and then apply the above theorem. □

The last theorem and its corollary remain true if we drop the assumption of finite dimension.
However, we then require much deeper machinery to prove the result. The following result is
crucial to the study of vector spaces as it allows us to define the dimension of a vector space. One
should think of the dimension of a vector space as generalizing the dimension of the familiar 2-D
or 3-D space.

Theorem 100.59. Suppose a vector space V contains a spanning set of m elements and a linearly
independent set of n elements. Then $m \geq n$.

Proof. Let $A = \{x_1, \ldots, x_m\}$ span V, and let $B = \{y_1, \ldots, y_n\}$ be linearly independent and
suppose that $m < n$. We wish to derive a contradiction.

We successively replace the xs by the ys, as follows. Since A spans V, there exist scalars
$\lambda_1, \ldots, \lambda_m$ such that

$y_1 = \lambda_1 \cdot x_1 + \cdots + \lambda_m \cdot x_m$.

At least one of the scalars, say $\lambda_1$, is non-zero (since $y_1 \neq 0$, as B is linearly independent) and
we may express $x_1$ in terms of $y_1$ and $x_2, \ldots, x_m$.
It is then clear that $A_1 = \{y_1, x_2, \ldots, x_m\}$ spans V.

We repeat the process m times and conclude that $A_m = \{y_1, \ldots, y_m\}$ spans V. (One can
formally dress this up as induction if one wants to be precise, which we will not bother with.)

By hypothesis $m < n$ and so $A_m$ is not the whole of B and $y_{m+1}$ is a linear combination of
$y_1, \ldots, y_m$, as $A_m$ spans V. This contradicts the fact that B is linearly independent. □
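
For readers who want the repeated replacement step spelled out, here is one way to write the general step. This is a sketch in the notation of the proof; the intermediate sets $A_k$ and the scalars $\mu_i$ are names introduced here for illustration.

```latex
% Assume A_k = \{y_1, \ldots, y_k, x_{k+1}, \ldots, x_m\} spans V for some k < m.
% Then y_{k+1} can be written as
y_{k+1} = \mu_1 \cdot y_1 + \cdots + \mu_k \cdot y_k
        + \lambda_{k+1} \cdot x_{k+1} + \cdots + \lambda_m \cdot x_m .
% Not all \lambda_j are zero: otherwise y_{k+1} would be a linear combination
% of y_1, \ldots, y_k, contradicting the linear independence of B.
% After renumbering the remaining x_j we may assume \lambda_{k+1} \neq 0, so
x_{k+1} = \lambda_{k+1}^{-1} \cdot \bigl( y_{k+1}
        - \mu_1 \cdot y_1 - \cdots - \mu_k \cdot y_k
        - \lambda_{k+2} \cdot x_{k+2} - \cdots - \lambda_m \cdot x_m \bigr) .
% Hence every element of A_k lies in the span of
% A_{k+1} = \{y_1, \ldots, y_{k+1}, x_{k+2}, \ldots, x_m\}, and A_{k+1} spans V.
```

The point to notice is that after the first step the non-zero coefficient must be found among the remaining xs, and it is the linear independence of B that guarantees this.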

Let  V  be  a  finite-dimensional  vector  space.  Suppose  A  is  a  basis  of  m  elements  and  B  a  basis 
of  n  elements.  By  applying  the  above  theorem  twice  (once  to  A  and  B  and  once  to  B  and  A)  we 
deduce  that  m  =  n.  From  this  we  conclude  the  following  theorem. 

Theorem 100.60. Let V be a finite-dimensional vector space. Then all bases of V have the same
number of elements; we call this number the dimension of V (written $\dim V$).

It is clear that $\dim K^n = n$. This agrees with our intuition that a vector with n components
lives in an n-dimensional world, and that $\dim \mathbb{R}^3 = 3$. Note that when referring to dimension we
sometimes need to be clear about the field of scalars. If we wish to emphasize the field of scalars
we write $\dim_K V$. This can be important, for example if we consider the complex numbers we have

$\dim_{\mathbb{C}} \mathbb{C} = 1, \quad \dim_{\mathbb{R}} \mathbb{C} = 2, \quad \dim_{\mathbb{Q}} \mathbb{C} = \infty$.

The  following  results  are  left  as  exercises. 

Theorem 100.61. If V is a (non-zero) finite-dimensional vector space, of dimension n, then

(1)  Given any linearly independent subset A of V, there exists a basis B such that $A \subseteq B$.

(2)  Given any spanning set C of V, there exists a basis B such that $B \subseteq C$.

(3)  Every linearly independent set in V has $\leq n$ elements.

(4)  If a linearly independent set has exactly n elements then it is a basis.

(5)  Every spanning set has $\geq n$ elements.

(6)  If a spanning set has exactly n elements then it is a basis.

Theorem 100.62. Let W be a subspace of a finite-dimensional vector space V. Then $\dim W \leq \dim V$,
with equality holding if and only if $W = V$.